(1.) Introduction.

In dealing with the problem of the relationship of attributes, not capable of
quantitative measurement, it has been usual to classify the two attributes into a
number of groups, $A_1, A_2, A_3, \ldots A_s$ and $B_1, B_2, B_3, \ldots B_t$. In this manner a table
has been formed containing $s$ columns and $t$ rows, or $s \times t$ compartments. The total
frequency of the population, or of the " universe " under consideration, to use the 
logician's phrase, is then distributed into sub-groups corresponding to these s X t 
compartments. In simple cases of association, as in that of the presence of the 
vaccination cicatrix and the recovery from an attack of smallpox, s and t are both 
equal to two, and we have a simple four-fold division of the universe. In other cases 
we have higher numbers, as when we classify the human eye into eight colour classes 
and correlate these classes with six or more classes for hair colour. We may even 
run up to as many as 18 to 25 classes for each attribute when we table the coat 
colours of thoroughbred horses or pedigree dogs in the case of pairs of blood relatives.

Hitherto, in order to obtain a measure of the degree of correlation or association, we
have proceeded on the assumption that it was necessary to arrange the system of
classes like $A_1, A_2, \ldots A_s$ in some order, which corresponded to a real quantitative
scale in the attribute, although we were unable to use this scale directly. Thus one
arranged eye-colours in what appeared to correspond to a scale of varying amounts
of orange pigment; the coat colours of horses were arranged in an order corresponding
fairly to what an artist would call their "value." I even analysed hair tints by
photographic processes. In all such cases the order seemed of vital importance.
Once this order was settled, the methods of my memoir* on the correlation of 
characters not quantitatively measurable could be applied: the actual scale
corresponding to the classification could be deduced, and we were able, on the assumption
of normal frequency, to actually plot the regression lines for the correlation of a
variety of attributes.† The conception, however, of order in the classification was
at times very hampering. Take three broad classes like those for human temper:
quick tempered, good natured, and sullen; it is difficult to grasp the exact meaning
of a quantitative scale at the basis of this classification, and it is not obvious that the 
right order is necessarily that with good-natured in the middle. Or, again, take the 
case of human hair ; omitting the brown reds, we can get a practically continuous series 
of shades from jet black to flaxen, and from flaxen with increasing red up to the 
deepest reds. Only the brown reds come in and upset the system ! We seem, 
therefore, forced to take a double scale, first one of black, and then one of red 
pigment. Or, again, take the coat colour of greyhounds ; these are classified into as 
many as 40 fairly narrow groups, and we can arrange these groups in ascending 
order of red, or black, or other pigmentation. We have more than one possible scale. 

Now in recent work on such things as temper in man, eye colour in man, and hair 
colour in man or other animals, I have proceeded to arrange my groups in two or 
three different orders, and to calculate the correlation on the basis of these
different orders. The results for the different orders came out in rather striking 
agreement, and the first sort of conclusion that one was tempted to draw was, for 
example, that the inheritance of pigmentation was strikingly alike for all pigments. 
But the agreement was in some cases far closer than one is accustomed to find when 
one compares the inheritance of directly measurable characters, and I soon became 
convinced that owing to some important theoretical law hitherto overlooked, the 
order of the groups by which we classify our attributes is a matter of no importance 
when we are determining correlation. The group order is all important for variation, 
it has practically no influence on correlation. We may put sullen tempers where we 
please in regard to quick and good-natured ; we may place the shades of red hair at 
either end of the hair scale or in the middle, and the inheritance coefficient will come
out nearly the same in value. Nay, we may go further, and classify finger prints
like Mr. Galton into "tents," "arches," "whorls," "croziers," &c., &c., and still be
able to find a numerical value of the degree of resemblance between two blood
relatives, although any arrangements of such groups into a possible quantitative
scale may be inconceivable. The object of this present paper is to deal with this
novel conception of what I have termed contingency, and to see its relation to our
older notions of association and normal correlation. The great value of the idea of
contingency for economic, social, and biometric statistics seems to me to lie in the
fact that it frees us from the need of determining scales before classifying our
attributes. I shall endeavour to illustrate the importance of this freedom in the
illustrations which follow the theoretical treatment of the subject.

* 'Phil. Trans.,' A, vol. 195, pp. 1-47.

† For example, for health and ability and for the correlation of the psychical and physical characters,
see the "Fourth Annual Huxley Lecture," 'Journal of the Anthropological Institute,' vol. 33, pp. 194-195.

(2.) On the Conception of Contingency. 

In mathematical treatises on algebra a definition is usually given of independent 
probability. If p be the probability of any event, and q the probability of a second 
event, then the two events are said to be independent, if the probability of the 
combined event be $p \times q$. Now let A be any attribute or character and let it be
classified into the groups $A_1, A_2, \ldots A_s$, and let the total number of individuals
examined be $N$, and let the numbers which fall into these groups be $n_1, n_2, \ldots n_s$
respectively. Then the probability of an individual falling into one or other of these
groups is given by $n_1/N, n_2/N, \ldots n_s/N$ respectively. Now suppose the same
population to be classified by any other attribute into the groups $B_1, B_2, \ldots B_t$, and
the group frequencies of the $N$ individuals to be $m_1, m_2, \ldots m_t$ respectively. The
probability of an individual falling into these groups will be respectively
$m_1/N, m_2/N, m_3/N, \ldots m_t/N$. Accordingly the number of combinations of $B_v$ with $A_u$ to be
expected on the theory of independent probability if $N$ pairs of attributes are
examined is

$$N \times \frac{n_u}{N} \times \frac{m_v}{N} = \frac{n_u m_v}{N} = \nu_{uv}.$$

Let the number actually observed be $n_{uv}$. Then, allowing for the errors of random
sampling,

$$n_{uv} - \frac{n_u m_v}{N} = n_{uv} - \nu_{uv}$$

is the deviation from independent probability in the occurrence of the groups $A_u$, $B_v$.
Clearly the total deviation of the whole classification system from independent
probability must be some function of the $n_{uv} - \nu_{uv}$ quantities for the whole table. I
term any measure of the total deviation of the classification from independent
probability a measure of its contingency. Clearly the greater the contingency, the
greater must be the amount of association or of correlation between the two 
attributes, for such association or correlation is solely a measure from another
standpoint of the degree of deviation from independence of occurrence. 
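
As a small illustration of the quantities just defined, the following Python sketch computes the independent-probability frequencies $\nu_{uv} = n_u m_v/N$ and the deviations $n_{uv} - \nu_{uv}$ for a table. The function names and the fourfold counts are hypothetical, chosen only for illustration; they are not data from this memoir.

```python
def independent_frequencies(table):
    """Frequencies expected on independent probability: nu_uv = n_u * m_v / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def deviations_from_independence(table):
    """Observed minus independent frequency, n_uv - nu_uv, for every compartment."""
    nu = independent_frequencies(table)
    return [[n - e for n, e in zip(obs_row, nu_row)]
            for obs_row, nu_row in zip(table, nu)]


# Hypothetical fourfold table (illustrative counts only, not from the memoir).
observed = [[30, 10],
            [20, 40]]

nu = independent_frequencies(observed)
deviations = deviations_from_independence(observed)
```

For these counts every marginal total is a round number, so the expected frequencies and the deviations can be checked by hand.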

Now it must be quite clear that if we make our measurement of contingency any
function whatever of such quantities as $n_{uv} - \nu_{uv}$, its magnitude will be absolutely
independent of the order of classification, i.e., its value will be unchanged if we
re-arrange the A's and the B's in any manner whatever. This is the fundamental
gain of this new conception of contingency. But precisely as we can measure
position or acceleration in a great variety of ways, so it is possible to measure
contingency. We must try to select out of these ways those which: (a) bring
contingency into line with the customary notions of correlation and association; and
(b) permit of not too laborious calculations leading to the required measure.
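
The independence of order can be checked numerically. The sketch below takes one arbitrary function of the deviations (the sum of their absolute values, chosen here purely as an example and treating all compartments alike) and evaluates it before and after re-arranging rows and columns. The table and the helper names are hypothetical.

```python
import math


def total_absolute_deviation(table):
    """One arbitrary function of the deviations: the sum of |n_uv - nu_uv|."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return sum(abs(n - r * c / total)
               for row, r in zip(table, row_totals)
               for n, c in zip(row, col_totals))


def rearranged(table, row_order, col_order):
    """Return the same table with its rows and columns put in a new order."""
    return [[table[i][j] for j in col_order] for i in row_order]


# Hypothetical 3 x 3 table (illustrative counts only).
observed = [[12, 5, 3],
            [6, 14, 10],
            [2, 8, 20]]

shuffled = rearranged(observed, row_order=[2, 0, 1], col_order=[1, 2, 0])
before = total_absolute_deviation(observed)
after = total_absolute_deviation(shuffled)
unchanged = math.isclose(before, after, rel_tol=1e-9)
```

The comparison uses a tolerance only because the floating-point summation order changes with the re-arrangement.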

We will consider these points at some length. I have shown in a paper,* "On
Deviations from the Probable in a Correlated System of Variables," that if
$m'_1, m'_2, \ldots m'_n$ be any system of observed frequencies and $m_1, m_2, \ldots m_n$ be any system
of theoretical frequencies known a priori, then if

$$\chi^2 = \text{Sum}\left\{\frac{(m'_q - m_q)^2}{m_q}\right\} \text{ from } q = 1 \text{ to } n$$

be calculated, we can deduce a quantity $P$ from $\chi^2$ which is the probability that in
any trial a system $m''_1, m''_2, \ldots m''_n$ of observed frequencies will occur, which
deviates more from $m_1, m_2, \ldots m_n$ than the actually observed system does. Tables
have been worked out by Mr. Palin Elderton giving the value of $P$, for a
considerable range of values of $\chi^2$ and $n$, and have been published in 'Biometrika.'†

Now it will be obvious that if we want to measure contingency, we really want to
measure the deviation of the observed results from independent probability, and
therefore if we take $m_1, m_2, \ldots m_n$ to correspond to the system $\nu_{uv}$ and $m'_1, m'_2, \ldots m'_n$
to correspond to the actually observed system $n_{uv}$,

$$\chi^2 = S\left\{\frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} \qquad \text{(i.)}$$

will be a proper quantity to calculate, and $P$ would measure how far the observed
system is or is not compatible with a basis of independent probability. If $P$ be large
the chances are in favour of the system arising from independent probability; if $P$ be
small there is certainly association between the attributes. Hence $1 - P$ would be a
proper measure of the contingency. I propose to call $1 - P$ the contingency grade.
Further, it is convenient to have a name for a function closely related to $\chi^2$. I shall
call

$$\phi^2 = \chi^2/N \qquad \text{(ii.)}$$

the mean square contingency.

* 'Phil. Mag.,' July, 1900, pp. 157-175.
† Vol. I., p. 155.
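
A minimal Python sketch of equations (i.) and (ii.) follows. The function names and the fourfold counts are hypothetical, and the sketch assumes that no row or column total is zero, since otherwise an independent frequency would vanish and the division would fail.

```python
def square_contingency(table):
    """chi^2 = S{(n_uv - nu_uv)^2 / nu_uv}, with nu_uv = n_u * m_v / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for row, r in zip(table, row_totals):
        for n, c in zip(row, col_totals):
            nu = r * c / total  # frequency on independent probability
            chi2 += (n - nu) ** 2 / nu
    return chi2


def mean_square_contingency(table):
    """phi^2 = chi^2 / N."""
    total = sum(sum(row) for row in table)
    return square_contingency(table) / total


# Hypothetical fourfold table (illustrative counts only).
observed = [[30, 10],
            [20, 40]]

chi2 = square_contingency(observed)
phi2 = mean_square_contingency(observed)
```

For this table each deviation is 10 in magnitude and the independent frequencies are 20, 20, 30, 30, so $\chi^2 = 100/20 + 100/20 + 100/30 + 100/30 = 50/3$ and $\phi^2 = 1/6$.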

It will be seen that, in the method by which we have approached the problem, we
have not had to consider the question of the sign of the contingency like $n_{uv} - \nu_{uv}$;
our mean square contingency is based on a summation of squares extending to all the
$s \times t$ compartments of the table. But if we treat now of quantities like $n_{uv} - \nu_{uv}$,
their total sum must be zero, since for the whole table

$$S(n_{uv}) = N = S(\nu_{uv}).$$

Let us suppose that the symbol $\Sigma$ refers to a summation of all the positive
contingencies, and let

$$\psi = \Sigma(n_{uv} - \nu_{uv})/N \qquad \text{(iii.)},$$

then $\psi$ shall be spoken of as the mean contingency. Clearly any functions of either
$\phi^2$ or $\psi$ would serve to measure the contingency. We shall be guided in our choice
of such functions by considering what are the values of $\phi^2$ and $\psi$ in the case of normal
correlation.
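
Equation (iii.) may be sketched in Python as below. The names and counts are hypothetical; the sketch also confirms that the deviations over the whole table sum to zero, so that the positive and the negative deviations carry equal totals.

```python
def table_deviations(table):
    """Flat list of n_uv - nu_uv over all compartments."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [n - r * c / total
            for row, r in zip(table, row_totals)
            for n, c in zip(row, col_totals)]


def mean_contingency(table):
    """psi = (sum of the positive deviations) / N."""
    total = sum(sum(row) for row in table)
    return sum(d for d in table_deviations(table) if d > 0) / total


# Hypothetical fourfold table (illustrative counts only).
observed = [[30, 10],
            [20, 40]]

psi = mean_contingency(observed)
net = sum(table_deviations(observed))
```

Here the two positive deviations are each 10 out of $N = 100$, giving $\psi = 0.2$.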


(3.) On the Relation between Mean Square Contingency and Normal Correlation. 

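Let $x$ and $y$ denote the deviations from their respective means of two characters or
attributes, of which $\sigma_x$, $\sigma_y$ are the standard deviations and $r$ is the correlation. Then
if we assume a normal distribution of frequency, $z_0\,\delta x\,\delta y$ would be the frequency of
individual pairs falling between $x$ and $x + \delta x$, $y$ and $y + \delta y$, where

$$z_0 = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)} \qquad \text{(iv.)},$$

on the assumption of independent probability, and $z\,\delta x\,\delta y$, where

$$z = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_x\sigma_y}\, e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)} \qquad \text{(v.)},$$

on the assumption of contingent probability.
We then have at once

$$\phi^2 = S\left\{\frac{(z\,\delta x\,\delta y - z_0\,\delta x\,\delta y)^2}{N z_0\,\delta x\,\delta y}\right\} = S\left\{\frac{(z - z_0)^2}{N z_0}\,\delta x\,\delta y\right\},$$

and we have only to insert the values of $z$ and $z_0$, given by (iv.) and (v.), and integrate
all over the plane of $x$, $y$, to find the mean square contingency.
Now, if $ac > b^2$, we know that

$$\frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}(ax^2 - 2bxy + cy^2)}\,dx\,dy = \frac{1}{\sqrt{ac - b^2}} \qquad \text{(vi.)}.$$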
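
This is all we need, for if $x = \sigma_x x'$, $y = \sigma_y y'$,

$$\begin{aligned}
\phi^2 &= \frac{1}{2\pi(1-r^2)}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(\frac{1+r^2}{1-r^2}x'^2 - \frac{4r}{1-r^2}x'y' + \frac{1+r^2}{1-r^2}y'^2\right)}\,dx'\,dy' \\
&\quad - \frac{2}{2\pi\sqrt{1-r^2}}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2(1-r^2)}(x'^2 - 2rx'y' + y'^2)}\,dx'\,dy' \\
&\quad + \frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{1}{2}(x'^2 + y'^2)}\,dx'\,dy' \\
&= \frac{1}{1-r^2} - 2 + 1 = \frac{r^2}{1-r^2} \qquad \text{(vii.)}.
\end{aligned}$$

Thus the mean square contingency is simply $r^2/(1-r^2)$. Or,

$$r = \pm\sqrt{\frac{\phi^2}{1+\phi^2}} \qquad \text{(viii.)}.$$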


Thus the relationship between mean square contingency and correlation in the case 
of normal frequency is of an extremely simple character. 
We see at once : — 

(i.) That since the mean square contingency is absolutely independent of the 
arrangement of our classes, the coefficient of correlation is also entirely 
independent of the arrangement of our classes on the basis of any assumed 
order or scale. 

(ii.) Provided our classes are sufficiently small to allow of us legitimately 
replacing by groupings over small areas the theoretical integrations, the 
coefficient of correlation can be found from the mean square contingency. 

We have thus an entirely new method of finding correlation in the case of 
quantitatively non-measurable characters. It assumes, however, that our classification- 
groups are sufficiently numerous and their contents sufficiently small to justify us in 
supposing that the contingency has reached a definite limit. Clearly in working in 
the future by the contingency method, we shall have to adopt rather more numerous 
classes, and they should not contain too irregular proportions of individuals, but we
can then afford to drop any question of scale or order of grouping.
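
The relation $\phi^2 = r^2/(1-r^2)$ can be checked numerically by replacing the integration with a summation over small square elements, in the spirit of the remark above about small groupings. The Python sketch below works in standardized units ($\sigma_x = \sigma_y = 1$, $N = 1$); the grid limits, the step, and the function names are assumptions made only for this illustration.

```python
import math


def mean_square_contingency_normal(r, half_width=8.0, step=0.1):
    """Midpoint-rule estimate of the integral of (z - z0)^2 / z0 over the plane,
    in standardized units (sigma_x = sigma_y = 1, N = 1)."""
    cells = int(round(2 * half_width / step))
    s = 1.0 - r * r
    norm = 1.0 / (2.0 * math.pi)
    total = 0.0
    for i in range(cells):
        x = -half_width + (i + 0.5) * step
        for j in range(cells):
            y = -half_width + (j + 0.5) * step
            z0 = norm * math.exp(-0.5 * (x * x + y * y))
            z = norm / math.sqrt(s) * math.exp(-(x * x - 2 * r * x * y + y * y) / (2 * s))
            total += (z - z0) ** 2 / z0
    return total * step * step


def correlation_from_phi2(phi2):
    """Positive root of r = sqrt(phi^2 / (1 + phi^2))."""
    return math.sqrt(phi2 / (1.0 + phi2))


r = 0.5
phi2 = mean_square_contingency_normal(r)
recovered = correlation_from_phi2(phi2)
```

With $r = 0.5$ the theoretical value is $\phi^2 = 0.25/0.75 = 1/3$, and the summation should agree with it closely, the truncation of the plane at eight standard deviations being negligible for this $r$.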

It may be asked whether this method of deriving the correlation from the 
contingency cannot replace the earlier method of deducing the correlation by the 
fourfold division of the material. The answer is that in some cases it can do so very 
advantageously, but it is very far from doing so in all. The contingency found from
a fourfold table is a perfectly real and very proper measure of the deviation of its
material from independent probability. But if this mean square contingency be
substituted in equation (viii.), it will not give us the correlation. The proper mean
square contingency to give us the correlation must be based on a sufficiently large
number of classes. When, however, we take, say, 20 classes for each attribute, we
have 400 terms to deal with in calculating $\phi^2$, and although the result might then
possibly give a more accurate value for the correlation than that found from a fourfold
division, yet the labour of determining it is far greater and may be excessive.
Further, the simple classification into two or three groups may be all we are able to
make at all, or all we can conveniently make. Hence the new conception of
contingency, while illuminating the whole subject, especially as demonstrating that
the correlation is independent of scale or grouping, does not do away with the older
method of the fourfold division. I propose to call the expression

$$\sqrt{\frac{\phi^2}{1+\phi^2}}$$

the first coefficient of contingency.

We note that with small enough classes the coefficient of contingency becomes the 
coefficient of correlation. Accordingly, with a view of lessening the number of 
coefficients in use, I adopt the following convention : Any expression or function of 
either the mean square contingency ($\phi^2$) or the mean contingency ($\psi$) (or indeed of
any other measure of the contingency), which, when the grouping is sufficiently small,
is theoretically equal to the coefficient of correlation — on the hypothesis of normal 
frequency — shall be termed a coefficient of contingency. All such coefficients of 
contingency must, on the same hypothesis, become equal on a sufficiently small 
grouping, and they will scarcely differ widely from each other when the frequency is 
not absolutely normal and the grouping is merely moderately small. These points 
will be illustrated later. 
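
The effect of the fineness of grouping on the first coefficient of contingency can be sketched numerically. The Python example below builds an approximate normal correlation surface with $r = 0.5$ in standardized units, groups it into $k \times k$ equal-width classes, and evaluates $\sqrt{\phi^2/(1+\phi^2)}$ for several $k$. The range of $\pm 6$, the class widths, and all function names are assumptions for illustration only; they are not groupings used in this memoir.

```python
import math


def normal_cell_probabilities(r, half_width=6.0, step=0.1):
    """Approximate probabilities of small square cells of a standardized normal
    correlation surface (density at the cell midpoint, renormalized to sum to 1)."""
    n = int(round(2 * half_width / step))
    s = 1.0 - r * r
    mids = [-half_width + (i + 0.5) * step for i in range(n)]
    cells = [[math.exp(-(x * x - 2 * r * x * y + y * y) / (2 * s)) for y in mids]
             for x in mids]
    total = sum(map(sum, cells))
    return [[c / total for c in row] for row in cells]


def regroup(cells, k):
    """Merge the fine cells into k x k equal-width classes (k must divide len(cells))."""
    size = len(cells) // k
    coarse = [[0.0] * k for _ in range(k)]
    for i, row in enumerate(cells):
        for j, p in enumerate(row):
            coarse[i // size][j // size] += p
    return coarse


def first_coefficient(proportions):
    """sqrt(phi^2 / (1 + phi^2)) for a table of cell proportions."""
    rows = [sum(row) for row in proportions]
    cols = [sum(col) for col in zip(*proportions)]
    phi2 = sum((p - a * b) ** 2 / (a * b)
               for row, a in zip(proportions, rows)
               for p, b in zip(row, cols))
    return math.sqrt(phi2 / (1.0 + phi2))


fine = normal_cell_probabilities(r=0.5)
coefficients = {k: first_coefficient(regroup(fine, k)) for k in (2, 4, 12, 120)}
```

Since each grouping here is a subdivision of the one before it, the coefficient can only rise as the classes are made finer. The fourfold division falls well short of $r$, and only the finest grouping comes close to it.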

(4.) On the Relation of Mean Contingency to Normal Correlation. 

A great deal of the labour of finding either the coefficient of contingency or the
coefficient of correlation by the method of mean square contingency when the groups 
are numerous, depends upon the squaring of the contingencies and dividing by the 
frequency to be expected on the basis of independent probabilities. The whole of 
this labour is escaped, if we work with the mean contingency instead of the mean 
square contingency; further, since in this case we only sum for the positive
contingencies, neglecting the negative, we have usually to deal with only, or often less
than, a moiety of the terms involved in calculating $\phi^2$. On the other hand, there is no
simple relation between the correlation and the mean contingency such as we have 
found between correlation and mean square contingency in equation (viii.) above. 
The relation is far more complex and is only expressible in the form of integrals
reducible by quadratures. Still for practical purposes we rarely want the coefficient
of contingency to more than two decimal places. Hence, if the integral be evaluated
for the coefficient proceeding by equal intervals, we can plot a curve giving the value
of the coefficient of contingency in terms of the mean contingency, and this will be
sufficiently accurate to enable us to read off the former in terms of the latter to the
required degree of accuracy. The enquiry also brings out some other points not
without interest.*

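To investigate the curve which in a normal correlation surface separates on the
plane of xy areas of positive from areas of negative contingency.

The frequency due to independent probability will be equal to that due to the
actual contingent probability when

$$\frac{1}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)} = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\, e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)},$$

where $r$ is the coefficient of correlation, or of contingency.
Clearly

$$(1-r^2)\log_e(1-r^2) = -r^2\left\{\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right\} \qquad \text{(ix.)}.$$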

Since r is always less than unity, this curve is clearly a hyperbola, which possesses 
several interesting properties. We see at once that all the contingency of one sense 
is grouped into the space between the two branches of this hyperbola, and that the 
contingency of the other sense is grouped into the two separate spaces inside the two
branches. Thus contingency of either sense is for normal correlation continuous, and 
abrupt changes of sign in the contingency — beyond the limits of random sampling — 
are not to be expected. 

By testing on actual correlation tables I find this hyperbola comes out in a fairly 
marked manner, in fact, quite as significantly as the elliptic contours of equal 
frequency. 

I propose to consider the properties of this zero contingency hyperbola — it forms 
the curve along which two really contingent events have a frequency identical with 
their independent probability. 
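
The separation of the plane by the zero contingency hyperbola can be verified point by point. In the Python sketch below, written for standardized units ($\sigma_x = \sigma_y = 1$) and with hypothetical function names, $\beta_0$ denotes the value of $x^2 - 2xy/r + y^2$ on the hyperbola (ix.), namely $-(1-r^2)\log_e(1-r^2)/r^2$. The sign of $z - z_0$ is compared with the side of the hyperbola on which each test point lies.

```python
import math


def beta_zero(r):
    """Value of x^2 - 2xy/r + y^2 on the zero contingency hyperbola (ix.)."""
    return -(1.0 - r * r) * math.log(1.0 - r * r) / (r * r)


def beta(x, y, r):
    """The quadratic form whose level curve beta = beta_zero is the hyperbola."""
    return x * x - 2.0 * x * y / r + y * y


def contingency(x, y, r):
    """z - z0 in standardized units, dropping the common factor N / (2 pi)."""
    s = 1.0 - r * r
    z0 = math.exp(-0.5 * (x * x + y * y))
    z = math.exp(-(x * x - 2 * r * x * y + y * y) / (2 * s)) / math.sqrt(s)
    return z - z0


r = 0.5
b0 = beta_zero(r)
# Test points on a half-unit lattice; none of them lies on the hyperbola itself.
points = [(i / 2.0, j / 2.0) for i in range(-6, 7) for j in range(-6, 7)]
agree = all((contingency(x, y, r) > 0) == (beta(x, y, r) < b0) for x, y in points)
```

Positive contingency is found where $\beta < \beta_0$, which includes the origin and the space between the two branches; negative contingency lies inside the branches.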

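Consider the two families of curves:

$$\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \alpha \qquad \text{(x.)},$$

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \beta \qquad \text{(xi.)}.$$

* I have to heartily thank my assistant, Dr. L. N. G. Filon, for the substance of the first part of the
investigation given below, down to equation (xiii.). I owe the calculation and plotting of the curves
$u = e^{-\kappa\sec\theta}$ to my assistant, Mr. J. C. M. Garnett.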

Since $r$ is always < 1, the $\alpha$ family form a set of concentric, similar, and
similarly-situated ellipses, and the $\beta$ family a set of concentric, similar, and similarly-situated
hyperbolas. Any conic having double contact with the hyperbola $\beta_0$, of zero
contingency defined by (ix.), at the ends of a diameter $y = mx$, has for its equation

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} - \beta_0 = \lambda\,(y - mx)^2.$$

If this be identical with an ellipse $\alpha$, we have, by comparing coefficients and
eliminating $\lambda$ and $m$,

$$\beta_0^2/\alpha^2 = 1/r^2.$$

Consequently $\alpha = \pm r\beta_0$, the sign being determined from the fact that $\alpha$ must
always be positive for real ellipses.

Now the ordinate $z$ of the normal frequency surface is given by

$$z = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\, e^{-\frac{\alpha}{2(1-r^2)}},$$

and to find the mean contingency we must determine the whole volume lying inside
the two branches of the above hyperbola, integrating on both sides of the line of
contact of the families of hyperbolas and ellipses.*

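We have $\iint z\,dx\,dy$ over this area $= \iint \frac{z}{J}\,d\alpha\,d\beta$,

where

$$J = \frac{\partial(\alpha, \beta)}{\partial(x, y)} = -\frac{4(1-r^2)}{r\sigma_x\sigma_y}\left(\frac{x^2}{\sigma_x^2} - \frac{y^2}{\sigma_y^2}\right)$$

from (x.) and (xi.).

But from (x.) and (xi.)

$$\left\{\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)^2 - \frac{4x^2y^2}{\sigma_x^2\sigma_y^2}\right\}(1-r^2)^2 = (\alpha^2 - r^2\beta^2)(1-r^2).$$

Or, choosing the signs to make $J$ positive, we have

$$J = \frac{4\sqrt{(1-r^2)(\alpha^2 - r^2\beta^2)}}{r\sigma_x\sigma_y}.$$

* The ellipses and hyperbolas have common pairs of conjugate diameters; one line of contact is one
of the asymptotes of the hyperbola $\frac{x^2}{\sigma_x^2} - \frac{y^2}{\sigma_y^2} = 1$; and tangents at an intersection point of any of the
family of ellipses with any of the family of hyperbolas are respectively parallel to conjugate diameters of
this hyperbola. These geometrical properties, however, need not detain us here.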
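
Thus the required integral is

$$I_r = \frac{1}{2\pi(1-r^2)}\int_{r\beta_0}^{\infty} \cos^{-1}\left(\frac{r\beta_0}{\alpha}\right) e^{-\frac{\alpha}{2(1-r^2)}}\,d\alpha.$$

To simplify put, using (ix.),

$$\alpha = r\beta_0\sec\theta, \qquad \kappa = \frac{r\beta_0}{2(1-r^2)} = -\frac{1}{2r}\log_e(1-r^2),$$

where $\kappa$ will always be positive, since $r < 1$.
We have

$$I_r = \frac{\kappa}{\pi}\int_0^{\frac{1}{2}\pi} \theta\, e^{-\kappa\sec\theta}\sec\theta\tan\theta\,d\theta,$$

or, integrating by parts,

$$I_r = \frac{1}{\pi}\int_0^{\frac{1}{2}\pi} e^{-\kappa\sec\theta}\,d\theta \qquad \text{(xiii.)}.$$

The curves $u = e^{-\kappa\sec\theta}$ were then plotted with our coordinatograph for a series of
values of $\kappa$ or $r$ on a large scale, drawn in with a spline and integrated with a Coradi
compensating planimeter. The values of $I_r$ resulting are tabled on p. 15.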

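We have next to investigate what is the volume $NQ_r$ of the surface of independent
probability

$$z_0 = \frac{N}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)}$$

which falls within the same hyperbola of contingency. We shall then have in $Q_r - I_r$
the required value of $\psi$, the mean contingency on the basis of normal correlation. We
have

$$Q_r = \frac{1}{2\pi\sigma_x\sigma_y}\iint e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)}\,dx\,dy$$

taken over the space inside the two branches of the hyperbola

$$\frac{x^2}{\sigma_x^2} - \frac{2xy}{r\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2} = \beta_0.$$

Write $x = x'\sigma_x$, $y = y'\sigma_y$, and we have

$$x'^2 - \frac{2x'y'}{r} + y'^2 = \beta_0.$$

Transform to polars, $\rho\cos\theta = x'$, $\rho\sin\theta = y'$,

$$\rho^2 = \frac{r\beta_0}{r - \sin 2\theta}.$$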

This shows us that the axes are given by $\theta = \frac{\pi}{4}$ and $\frac{\pi}{2} + \frac{\pi}{4}$, or are $a$ and $b$, where

$$a^2 = r\beta_0/(1+r), \qquad b^2 = r\beta_0/(1-r).$$

Take these axes as axes of coordinates. Then we have to integrate

$$Q_r = \frac{1}{\pi}\iint e^{-\frac{1}{2}(x^2+y^2)}\,dx\,dy,$$

over the area inside one branch of the hyperbola

$$(1+r)x^2 - (1-r)y^2 = r\beta_0 \qquad \text{(xiv.)}.$$

Let

$$x^2 + y^2 = \alpha, \qquad x^2 - y^2 + r(x^2 + y^2) = r\beta \qquad \text{(xv.)},$$

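and let us transfer the integrations to $\alpha$ and $\beta$.

We have

$$x^2 = \tfrac{1}{2}\{\alpha - r(\alpha - \beta)\},$$

$$y^2 = \tfrac{1}{2}\{\alpha + r(\alpha - \beta)\},$$

and

$$\frac{\partial\alpha}{\partial y}\frac{\partial\beta}{\partial x} - \frac{\partial\alpha}{\partial x}\frac{\partial\beta}{\partial y} = \frac{8xy}{r} = \frac{4}{r}\sqrt{\alpha^2 - r^2(\alpha - \beta)^2}.$$

Thus we have

$$Q_r = \frac{r}{2\pi}\iint \frac{e^{-\frac{1}{2}\alpha}\,d\alpha\,d\beta}{\sqrt{\alpha^2 - r^2(\alpha - \beta)^2}}$$

over one-half one branch of the hyperbola.

The limits are obtained from the consideration, easily seen on a figure, that for a
given $\alpha$ we must integrate from $\beta = \beta_0$, the given hyperbola, to $\beta = \frac{1+r}{r}\alpha$, the
touching hyperbola; and then for $\alpha$ we must take every circle from that touching $\beta_0$,
i.e., $\alpha = r\beta_0/(1+r)$ up to infinity.

We will first integrate with regard to $\beta$, and put

$$r(\alpha - \beta) = -\alpha\sin\phi.$$

This gives, when $\beta = (1+r)\alpha/r$, $\phi = \frac{1}{2}\pi$; and when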
$\beta = \beta_0$, $\sin\phi = r(\beta_0 - \alpha)/\alpha$. Hence

$$Q_r = \frac{1}{2\pi}\int_{r\beta_0/(1+r)}^{\infty} e^{-\frac{1}{2}\alpha}\cos^{-1}\left\{\frac{r(\beta_0 - \alpha)}{\alpha}\right\}d\alpha.$$

Take

$$\cos\chi = r(\beta_0 - \alpha)/\alpha,$$

then

$$\alpha = \infty,\ \cos\chi = -r; \qquad \alpha = r\beta_0/(1+r),\ \cos\chi = 1.$$

Hence

$$Q_r = \frac{1}{\pi}\int_0^{\cos^{-1}(-r)} e^{-\frac{1}{2}\frac{r\beta_0}{r + \cos\chi}}\,d\chi,$$

observing that the term between the limits vanishes at both.

Take

$$\cos\theta = (r + \cos\chi)/(r + 1).$$

Then

$$\chi = 0,\ \theta = 0; \qquad \chi = \cos^{-1}(-r),\ \theta = \tfrac{1}{2}\pi.$$

Thus we find finally, after some reductions,

$$Q_r = \frac{1}{\pi}\int_0^{\frac{1}{2}\pi} e^{-\kappa'\sec\theta}\sqrt{\frac{1 + \cos\theta}{\epsilon + \cos\theta}}\,d\theta,$$

where

$$\epsilon = (1-r)/(1+r), \qquad \kappa' = -\frac{1-r}{2r}\log_e(1-r^2) \qquad \text{(xix.)},$$

$= (1-r)\kappa$, of the integral $I_r$.

Tables were now formed of $\epsilon$ and $\kappa'$ and the ordinates of the curves

$$v = e^{-\kappa'\sec\theta}\sqrt{\frac{1 + \cos\theta}{\epsilon + \cos\theta}}$$

calculated.* These ordinates were plotted on a large scale by aid of a Coradi
coordinatograph and the resulting curves integrated as before; the values of $Q_r$ thus
found are given with the values of $I_r$ and $\psi$ in the table below. I believe this table
gives the mean contingency in terms of the correlation true to at least three places of
decimals. The $u$ and $v$ curves are both interesting analytically and subject to rather
curious changes of type. We were aided in plotting them by calculating, where
needful, $\frac{du}{d\theta}$ and $\frac{dv}{d\theta}$. Finally, the values of $r$ were plotted by my demonstrator,
Mr. L. W. Atcherley, to the corresponding values of $\psi$. Thus a curve was obtained,
which enables us to read off the correlation from the contingency correct to at least
two places of decimals, sufficient for nearly all practical purposes.

* I owe the calculation of these ordinates to Dr. Alice Lee.

Table I. Table of Integrals $I_r$, $Q_r$, and the Contingency $\psi$ for Values of $r$.

    r        I_r       Q_r       ψ
    0.00     0.5000    0.5000    0.0000
    0.05     0.4620    0.4762    0.0142
    0.10     0.4342    0.4652    0.0310
    0.20     0.3895    0.4536    0.0641
    0.30     0.3501    0.4498    0.0996
    0.40     0.3162    0.4547    0.1385
    0.50     0.2830    0.4643    0.1813
    0.60     0.2489    0.4814    0.2325
    0.70     0.2128    0.5106    0.2978
    0.80     0.1700    0.5524    0.3824
    0.90     0.1186    0.6279    0.5093
    0.95     0.0796    0.7009    0.6213
    1.00     0.0000    1.0000    1.0000


Diagram I. at the end of this memoir will therefore serve for most purposes of 
interpolation, and it will be seen that now that the integrals have been evaluated and 
the diagram constructed, the correlation can be very easily found from mean con- 
tingency. But the method seems to me distinctly inferior to that of mean square 
contingency, and this for much the same reasons that mean error calculations are 
inferior to mean square error work in curve fitting. Further, the grade of contingency 
can be found at once from a knowledge of mean square contingency, and whatever be 
the distribution is a significant and interpretable constant. This is only true of the 
correlation deduced from mean contingency if the distribution be normal. 
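
The integrals of this section can also be evaluated by ordinary numerical quadrature in place of the planimeter. The Python sketch below assumes the forms $I_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-\kappa\sec\theta}\,d\theta$ with $\kappa = -\log_e(1-r^2)/(2r)$, and $Q_r = \frac{1}{\pi}\int_0^{\pi/2} e^{-(1-r)\kappa\sec\theta}\sqrt{(1+\cos\theta)/(\epsilon+\cos\theta)}\,d\theta$ with $\epsilon = (1-r)/(1+r)$, for $0 < r < 1$. The midpoint rule, the number of steps, and the function names are choices made for this illustration only.

```python
import math


def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule; it never evaluates f at the end points."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def kappa(r):
    """kappa = -log(1 - r^2) / (2 r), for 0 < r < 1."""
    return -math.log(1.0 - r * r) / (2.0 * r)


def integral_i(r):
    """Contingent-probability volume inside the zero contingency hyperbola."""
    k = kappa(r)
    return midpoint_integral(lambda t: math.exp(-k / math.cos(t)),
                             0.0, math.pi / 2) / math.pi


def integral_q(r):
    """Independent-probability volume inside the same hyperbola."""
    k1 = (1.0 - r) * kappa(r)
    eps = (1.0 - r) / (1.0 + r)

    def v(t):
        c = math.cos(t)
        return math.exp(-k1 / c) * math.sqrt((1.0 + c) / (eps + c))

    return midpoint_integral(v, 0.0, math.pi / 2) / math.pi


def mean_contingency_normal(r):
    """psi = Q_r - I_r on the basis of normal correlation."""
    return integral_q(r) - integral_i(r)


i_half = integral_i(0.5)
q_half = integral_q(0.5)
psi_half = mean_contingency_normal(0.5)
```

Values computed in this way should fall close to the entries of Table I. Small differences in the later decimal places are to be expected, since the tabled values were obtained graphically.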

(5.) To sum up our results so far:

We have, if

$n_{uv}$ be the actual frequency of a group in the population $N$ which combines the
characters $A_u$ and $B_v$, $\nu_{uv}$ be the frequency of this group on the hypothesis of
independent probability, then

$n_{uv} - \nu_{uv}$ is simply a sub-contingency,

$S\left\{\frac{(n_{uv} - \nu_{uv})^2}{\nu_{uv}}\right\} = \chi^2$ may be termed the square contingency,

$S\left\{\frac{(n_{uv} - \nu_{uv})^2}{N\nu_{uv}}\right\} = \phi^2$ is the mean square contingency,

$\Sigma\left\{\frac{n_{uv} - \nu_{uv}}{N}\right\} = \psi$, where $\Sigma$ is the sum for positive (or negative)
sub-contingencies only, is the mean contingency.

Any one of these expressions is a measure of the deviation of the system from
independent probability, and therefore of the amount of association or correlation
between the characters or attributes involved. But any function of these expressions
is also a proper measure. Such functions are:

(a.) The contingency grade. This is $1 - P$, where $P$ is to be found from $\chi^2$ by aid
of the tables for "goodness of fit." See 'Biometrika,' vol. I, pp. 155, et seq.

(b.) The mean square contingency coefficient $= C_1$, where

$$C_1 = \sqrt{\frac{\phi^2}{1+\phi^2}} \qquad \text{(xxi.)}.$$

(c.) The mean contingency coefficient $= C_2$, where $C_2$ is to be found from the table
on p. 15 or from Diagram I. at the end of this memoir.

In the case of sufficiently small grouping and normal correlation we have

$$C_1 = C_2 = \text{coefficient of correlation}.$$

But it must not be forgotten that this is essentially a limiting, not a general case.
Nevertheless the approach to equality of the two contingency coefficients will be a
good measure of the normality of the distribution and the suitability as to smallness
of our elements of grouping.

(6.) A little experience of actual working, however, shows that in practice it is
perfectly easy to overshoot the mark in fineness of grouping. Suppose that in
dealing with 1000 cattle we find a single instance of a calf inscribed as "mulberry,"
say the offspring of a red cow by a dark fawn bull. Now if there be 30 dark fawn
bulls, the independent probability of a dark fawn bull having a mulberry offspring
is .03. Hence the sub-contingency for a ♂ parent-offspring table $= 1 - .03 = .97$,
and the corresponding contribution to the square contingency will be $(.97)^2/.03$, or
is upwards of 31. The fact is, that when we come to very fine groupings we get at
once into difficulties owing to our having to record by units only. Suppose
"mulberry" calves actually had no relation to any special parentage, but were rare
anomalies occurring once among 1000 calves, or perhaps were merely an odd breeder's
fancy description, then a unit cannot be divided in the proportions of the colour
parentage, it must fall into some one colour parentage group. The result is
that a few isolated individuals will give large contributions to the mean square
contingency. The above example is purely hypothetical, but similar cases have
actually occurred in dealing with colour problems by the contingency method. They
are exactly similar to those which occur when dealing with outlying individuals by
the test for "goodness of fit." In a frequency distribution we proceed only by units,
but the theory gives fractional values of the frequency; hence in forming the value of
$\chi^2$ to measure goodness of fit, one or two unit "outliers," although not improbable as
far as the whole of the tail of a curve is concerned, may be exceedingly improbable if
considered from the standpoint of the actual group in which they do occur. This
point must be carefully borne in mind in actual practice, for by sufficient refinement
of grouping, i.e., till we reduce certain groups to a single individual or two, the mean
square contingency can be increased in a remarkable manner.
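
The arithmetic of the hypothetical "mulberry" case can be written out as follows. The Python sketch simply restates the numbers given above (1000 cattle, 30 dark fawn bulls, a single mulberry calf); the variable names are illustrative.

```python
total_cattle = 1000      # N
dark_fawn_bulls = 30     # marginal total of the parent class
mulberry_calves = 1      # marginal total of the offspring class
observed = 1             # the single calf falls in this one compartment

# Frequency on independent probability: product of the marginals over N.
independent = dark_fawn_bulls * mulberry_calves / total_cattle

# Sub-contingency and its contribution to the square contingency.
sub_contingency = observed - independent
contribution = sub_contingency ** 2 / independent
```

This gives an independent frequency of .03, a sub-contingency of .97, and a contribution of $(.97)^2/.03$, which is a little over 31, all from one individual.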

(7.) Of course this is merely saying that the probable errors of the
sub-contingencies increase largely when we make $\nu_{uv}$ very small. Unfortunately I have
not yet succeeded in determining the probable errors of the contingency coefficients.
If $c_{uv}$ be the contingency, determined by

$$c_{uv} = n_{uv} - \nu_{uv},$$

and $\Sigma_{c_{uv}}$ its standard deviation for random sampling, I find

i.„. - n„, [\ - -~j + _ [n,, 4- n, ^j ~ ^ N^ V N~ / ' ^ ''' 

so that the probable error of any individual contingency $= .67449\,\Sigma_{c_{uv}}$ is determined.
Further, if $R_{c_{uv}c_{u'v'}}$ be the correlation between errors due to random sampling in two
contingencies $c_{uv}$ and $c_{u'v'}$, not belonging to either the same row or column,

"f p * ip ^xxm.;. 

Similarly we find for the correlation of errors of two contingencies of the same
column, $R_{c_{uv}c_{uv'}}$, the result

■*c„'4c„'-C>'c„„-t»'c„' — ^^f 5V I J- — 


N N \ N 

+ W^/l_l!L«\ • (xxiv.), 


and for errors of two contingencies of the same row, 

^ •C D "^VAp'ti/v '^K'^ u'v ~r ^m'^ to / -I "^_« 

^0„^(;^\^0„C„% TVT AT \ -^ AT 


N N \ N 

. Eesults .(xxii.) to (xxv.) enable us to find the probable errors and the error 
correlations for any individual contingencies which will arise from random sampling, 
and are so far of value ; but when we attempt to find the general expression for the 
probable error of either the mean or mean square contingency, it becomes so complex 

c 
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that there appears little hope of deducing a simple result. Arithmetically the 
problem might be solved at the expense of rather troublesome numerical calculations 
if the number of sub-groups was not very large. A general and simple expression for 
the probable error of xjj or (f>^ involving xjj or <f>^ only does not appear likely to exist, 
and an expression involving all the sub-group frequencies would be very troublesome 
for computation. Practically the errors of the contingency coefficients may be fairly 
reasonably taken to lie between the probable errors of r as found by a fourfold 
division of a table and by the product method, approaching the latter more closely as 
the number of sub-groups is sufficiently increased. With the experience of probable 
errors of fourfold tables before us we may, I think, safely take the probable error of a 
contingency coefficient C for rough judgments to be less than 

$$2 \times .67449\,\frac{1 - C^2}{\sqrt{N}},$$


i.e., double the probable error of a correlation coefficient found from the product 
moment. At the same time we must distinctly be cautious, remembering the difficulty 
as to isolated units referred to in the previous section. 

We may look at the probable error of the contingency from another standpoint. 

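Taking the mean squared contingency, we have

$$1 + \phi^2 = \frac{1}{1 - r^2}.$$

Therefore

$$\delta\phi^2 = \frac{2r}{(1 - r^2)^2}\,\delta r,$$

and accordingly, if $\Sigma_{\phi^2}$, $\Sigma_r$ be the standard deviations in errors of $\phi^2$ and $r$,

$$\Sigma_{\phi^2} = \frac{2r}{(1 - r^2)^2}\,\Sigma_r = \frac{2r}{\sqrt{N}}\,\frac{1}{1 - r^2}.$$

Hence if we were to determine $\phi^2$ from $r$, the probable error of $\phi^2$ would be
given by

$$\text{Probable error of } \phi^2 = .67449\,\frac{2}{\sqrt{N}}\,\sqrt{\phi^2(1 + \phi^2)}.$$

Or, we can put it into the more useful form,

$$\text{Percentage probable error of } \phi^2 = \frac{134.898}{\sqrt{N}}\,\sqrt{1 + \frac{1}{\phi^2}}.$$

Thus the percentage probable error increases rapidly as the contingency gets smaller.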

* 'Phil. Trans.,' A, vol. 191, p. 242.

Of course, the probable error of $\phi^2$ as found from $r$ is not necessarily the same as
the probable error of $\phi^2$ found directly, but it may serve as a guide to its approximate
value.

If it were the same, the probable error of $r$ as found from $\phi^2$ would be
$.67449/\{(1 + \phi^2)\sqrt{N}\}$, a result, as indicated in the previous paragraph, much too
small, except possibly for very successful systems of grouping.
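
As a numerical aid (a modern sketch, not part of the memoir), the relations of this section can be evaluated directly. The function names and the sample values of r and N below are illustrative assumptions; the formulae are those stated above for the case in which the mean square contingency is deduced from r.

```python
import math

PE = 0.67449  # probable error = 0.67449 x standard deviation


def phi2_from_r(r):
    """Mean square contingency of a normal correlation surface: r^2 / (1 - r^2)."""
    return r * r / (1.0 - r * r)


def probable_error_phi2(r, n):
    """Probable error of phi^2 when phi^2 is deduced from r (n = total frequency)."""
    return PE * 2.0 * r / (math.sqrt(n) * (1.0 - r * r))


def percentage_probable_error_phi2(phi2, n):
    """The same probable error as a percentage of phi^2, written in terms of phi^2."""
    return 100.0 * PE * 2.0 / math.sqrt(n) * math.sqrt(1.0 + 1.0 / phi2)


def probable_error_r_from_phi2(phi2, n):
    """Probable error of r if r were deduced from phi^2."""
    return PE / ((1.0 + phi2) * math.sqrt(n))


n = 1000                      # hypothetical total frequency
for r in (0.2, 0.5, 0.8):     # hypothetical correlations
    phi2 = phi2_from_r(r)
    print(r, round(phi2, 4), round(probable_error_phi2(r, n), 4),
          round(percentage_probable_error_phi2(phi2, n), 2))
```

The printed percentages grow as r (and therefore the contingency) falls, which is the behaviour remarked upon in the text.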

(8.) To find under what other condition than normal correlation small changes in 
the order of grouping will not affect the value of the correlation. 

Let us assume the unit of grouping to be very small, but not necessarily the same
for all groups. Let the two characters or attributes be $x$ and $y$, and suppose $n_s$ to
be the total frequency of individuals in the range $y_s - \epsilon$ to $y_s + \epsilon$, and $n_{s+1}$ to be the
total frequency in the range $y_{s+1} - \epsilon'$ to $y_{s+1} + \epsilon'$. Let $y_{s+1} - y_s = \epsilon + \epsilon' = h$ be so
small that its square may be neglected. Let $\bar{x}$, $\bar{y}$ be the mean values of the
characters, $N$ the total frequency. We will find the changes in the moments and
constants supposing the arrays $n_s$ and $n_{s+1}$ interchanged in position.

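Clearly $\delta\bar{x} = 0$ and $\delta\sigma_x = 0$.

$$N(\bar{y} + \delta\bar{y}) = S(y_s n_s) + h(n_s - n_{s+1}),$$

or,

$$\delta\bar{y} = h(n_s - n_{s+1})/N.$$

$$N(\sigma_y + \delta\sigma_y)^2 = S(y_s^2 n_s) + 2h(y_s n_s - y_{s+1} n_{s+1}) - N(\bar{y} + \delta\bar{y})^2,\;{}^{*}$$

$$2N\sigma_y\,\delta\sigma_y = 2h(y_s n_s - y_{s+1} n_{s+1}) - 2N\bar{y}\,\delta\bar{y},$$

$$\frac{\delta\sigma_y}{\sigma_y} = \frac{h\{(y_s - \bar{y})\,n_s - (y_{s+1} - \bar{y})\,n_{s+1}\}}{\sigma_y^2\,N}.$$

Next if

$$P = S(xy) - N\bar{y}\bar{x},$$

$$P + \delta P = S(xy) + h(n_s\bar{x}_s - n_{s+1}\bar{x}_{s+1}) - N\bar{y}\bar{x} - N\bar{x}\,\delta\bar{y},$$

or,

$$\delta P = h\{n_s(\bar{x}_s - \bar{x}) - n_{s+1}(\bar{x}_{s+1} - \bar{x})\},$$

where $\bar{x}_s$ and $\bar{x}_{s+1}$ are the means of the arrays $n_s$ and $n_{s+1}$.
But if $r$ be the correlation coefficient of the $x$ and $y$ characters,

$$r = \frac{P}{N\sigma_x\sigma_y}.$$

Therefore

$$\frac{\delta r}{r} = \frac{\delta P}{P} - \frac{\delta\sigma_y}{\sigma_y},$$

and substituting the above values,

$$\frac{\delta r}{r} = h\left\{\frac{n_s(\bar{x}_s - \bar{x}) - n_{s+1}(\bar{x}_{s+1} - \bar{x})}{P} - \frac{(y_s - \bar{y})\,n_s - (y_{s+1} - \bar{y})\,n_{s+1}}{N\sigma_y^2}\right\}.$$

If this is to vanish for any value of $s$ and $h$, it will be sufficient, since

$$P = r \times N\sigma_x\sigma_y,$$

that

$$\bar{x}_s - \bar{x} = \frac{r\sigma_x}{\sigma_y}\,(y_s - \bar{y}),$$

and

$$\bar{x}_{s+1} - \bar{x} = \frac{r\sigma_x}{\sigma_y}\,(y_{s+1} - \bar{y}).$$

Or, if the mean $\bar{x}_m$ of any $y_m$-array of individuals be determined by

$$\bar{x}_m - \bar{x} = \frac{r\sigma_x}{\sigma_y}\,(y_m - \bar{y}).$$

But this is the condition for linear regression.

* It must be noted here that the squares of the change in $\bar{y}$ and $\sigma_y$ are neglected. Hence the changes
must not be so great that $\delta\bar{y}$ and $\delta\sigma_y$ are sensible as compared with $\bar{y}$ and $\sigma_y$.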

Hence we conclude that in any correlated system of variables, obeying the law of
linear regression, we can, without sensibly modifying the correlation, interchange two
adjacent $y$-arrays (e.g., two rows of the correlation table), provided the grouping be
fine. But if we can interchange any two adjacent $y$-arrays, we can, by a repetition
of such changes, interchange any two $y$-arrays whatever; and a precisely similar
statement must be valid for any two $x$-arrays (e.g., two columns of the correlation
table). Hence, given a sufficiently small system of grouping, we may state that in all
cases of linear regression the actual order of the scales is immaterial as far as the
determination of the correlation is concerned.
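
The conclusion can be tried numerically. The following Python sketch is an illustration only and rests on assumptions not in the memoir: the table is built from a normal correlation surface (which has linear regression) on a square grid of step h, the value r = .5 is arbitrary, and the helper names are hypothetical. It interchanges one pair of adjacent y-arrays and reports the change in the product-moment correlation for a fine and for a coarse grouping.

```python
import math


def grouped_r(table, xs, ys):
    """Product-moment correlation of a grouped table.

    table[j][i] is the frequency of the cell centred at (xs[i], ys[j]).
    """
    n = sum(sum(row) for row in table)
    mean_x = sum(f * xs[i] for row in table for i, f in enumerate(row)) / n
    mean_y = sum(f * ys[j] for j, row in enumerate(table) for f in row) / n
    var_x = sum(f * (xs[i] - mean_x) ** 2 for row in table for i, f in enumerate(row)) / n
    var_y = sum(f * (ys[j] - mean_y) ** 2 for j, row in enumerate(table) for f in row) / n
    cov = sum(f * (xs[i] - mean_x) * (ys[j] - mean_y)
              for j, row in enumerate(table) for i, f in enumerate(row)) / n
    return cov / math.sqrt(var_x * var_y)


def normal_table(rho, h, half_width=4.0):
    """Frequencies proportional to a normal correlation surface on a grid of step h."""
    steps = int(round(2.0 * half_width / h))
    grid = [-half_width + k * h for k in range(steps + 1)]
    c = 1.0 - rho * rho
    table = [[math.exp(-(x * x - 2.0 * rho * x * y + y * y) / (2.0 * c)) for x in grid]
             for y in grid]
    return table, grid


def change_in_r(rho, h, y_value=1.0, half_width=4.0):
    """r before the interchange, and the change in r when the y-array at
    y_value is interchanged with the next higher one."""
    table, grid = normal_table(rho, h, half_width)
    s = int(round((y_value + half_width) / h))
    before = grouped_r(table, grid, grid)
    table[s], table[s + 1] = table[s + 1], table[s]
    return before, grouped_r(table, grid, grid) - before


r_fine, shift_fine = change_in_r(0.5, 0.1)      # fine grouping
r_coarse, shift_coarse = change_in_r(0.5, 1.0)  # coarse grouping
print(r_fine, shift_fine, shift_coarse)
```

With the fine grouping the change in r is of the order of the neglected square of h; with the coarse grouping it is no longer insensible, which is why the proviso "provided the grouping be fine" matters.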

The practical importance of this result would appear to be great, for it frees us
when dealing with scale orders from the need for supposing normal frequency; the
indifference of the scale order when determining correlation is still true, provided the
regression is linear; and this linearity of regression is not only found from observation
to be very general, for example, in inheritance problems,* but follows from theory
itself in the case of various hypotheses.†

In actual practice, of course, the degree of fineness of the grouping is limited by
many considerations, and hence it will often be better to proceed by the fourfold
division method, taking that division where possible at a very distinct classification.
But the general principle now demonstrated will enable us in future to pay much less
attention to the actual order chosen for the scales if we are dealing with a class of
characters for which we may reasonably presume the regression to be sensibly linear.

* See "The Laws of Inheritance in Man. I. Inheritance of the Physical Characters," 'Biometrika,'
vol. 2, pp. 362-3; also "Inheritance of Mental and Moral Characters in Man," 'Huxley Memorial
Lecture,' 1903, 'Journal of the Anthropological Institute,' vol. 33, pp. 185-7.

† "Contributions to the Theory of Evolution. XII. On a Generalised Theory of Alternative
Inheritance, with special reference to Mendel's Laws." 'Phil. Trans.,' A, vol. 203, p. 85.

(9.) If we take the crudest possible division of our material into only four groups,
thus:

| a | c | a + c |
|---|---|---|
| d | b | d + b |
| a + d | c + b | N |

corresponding to what Mr. Yule has termed the association of two attributes, we
have at once

$$\phi^2 = \frac{(ab - cd)^2}{(a + d)(c + b)(a + c)(d + b)} \qquad \text{(xxvii.)},$$

$$\psi = \frac{2(ab - cd)}{N^2} \qquad \text{(xxviii.)}.$$

Now it is clear that in this case $\phi^2$ reduces to $r_{hk}^2$, where $r_{hk}$ is the correlation
between errors in the position of the means of the two characters under consideration,
as determined by a fourfold table, and $\frac{1}{2}\psi$ is in this simple case what I have defined
as the transfer per unit of total frequency.* Both are expressions intimately
connected with the conception of association, and have already been discussed in
relation to it.† The coefficients, $C_1$ and $C_2$, of contingency, either of which might
serve as a measure of the association, will not in this simple case, however, be
necessarily even approximately equal to each other, still less to either the coefficient
of correlation or Mr. Yule's coefficient of association.‡

It is worth while illustrating this on a numerical example. Taking the small-pox 
returns for the epidemic of 1890, we have : — 


| Cicatrix. | Recoveries. | Deaths. | Totals. |
|---|---|---|---|
| Present | 1562 | 42 | 1604 |
| Absent | 383 | 94 | 477 |
| Totals | 1945 | 136 | 2081 |

* 'Phil. Trans.,' A, vol. 195, pp. 12 and 14.

† Ibid., p. 15 et seq.

‡ 'Phil. Trans.,' A, vol. 194, p. 272.

These give us $\phi^2 = .0845$, $\chi^2 = 175.76$, $\psi = .0604$. From these we find

$C_1 = .279$, $C_2 = .190$.

Yule's coefficient of association = .803.

Coefficient of correlation by fourfold division = .595.

Grade of contingency = $1 - P$,* where $P = 718/10^{40}$.

Now so far as numerical values go these things are all totally different. $C_1$, $C_2$,
and the coefficient of association depend very largely on where the fourfold division is
taken.† It is extremely difficult to use them therefore for comparative purposes. On
the other hand, the coefficient of correlation, with the assumption, however, of
normality, is free of this restriction; it brings us into line with other things for
comparative purposes. The grade of contingency is also independent in a sense of
the division, i.e., it has a definite physical meaning. What it tells us is this, that the
deviation from independent probability in the relation between the result of a case of
small-pox and the presence or absence of cicatrix is such that the above table could only
arise 718 times in $10^{40}$ cases if the two events were absolutely independent.
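
The figures of this example can be checked with a short Python sketch (a modern illustration, not the memoir's own working). It assumes the fourfold formulae for the mean square and mean contingency given in this section, the coefficient $C_1 = \sqrt{\phi^2/(1+\phi^2)}$, Yule's coefficient in the form (ab - cd)/(ab + cd), and, for P, the four-group expression cited from the 'Phil. Mag.' as it is read here; the function name is hypothetical. $C_2$ and the fourfold correlation are not computed, since they need the diagram and tables of the earlier memoirs.

```python
import math


def fourfold_measures(a, c, d, b):
    """Contingency measures for the fourfold table
           a | c
           d | b
    """
    n = a + b + c + d
    cross = a * b - c * d
    phi2 = cross ** 2 / ((a + d) * (c + b) * (a + c) * (d + b))  # mean square contingency
    psi = 2.0 * cross / n ** 2                                   # mean contingency
    chi2 = n * phi2
    chi = math.sqrt(chi2)
    # sqrt(2/pi) * integral from chi to infinity of exp(-x^2/2) dx equals erfc(chi/sqrt(2)).
    p = math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * chi * math.exp(-chi2 / 2.0)
    return {
        "phi2": phi2,
        "chi2": chi2,
        "psi": psi,
        "C1": math.sqrt(phi2 / (1.0 + phi2)),
        "Q": cross / (a * b + c * d),   # Yule's coefficient of association
        "P": p,
    }


m = fourfold_measures(a=1562, c=42, d=383, b=94)
for name, value in m.items():
    print(name, value)
```

The computed P comes out of the same order as the printed value (a few units in $10^{-38}$); agreement to the last figure is not to be expected from a re-computation in floating point.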

If, instead of a table like the above, we take a number of alternative possibilities
for each attribute, the coefficient of association loses its uniqueness of meaning;
$C_1$ and $C_2$ still retain their significance, and as the number of alternatives becomes
greater, merge in the coefficient of correlation. The grade of contingency, on the other
hand, retains the same perfectly definite meaning throughout. I think this statement
may serve as some warning of the caution needful in using the coefficients now
introduced. The degree of approach of both $C_1$ and $C_2$ to the correlation must be
studied for each special class of cases, and only when this has been done will their
use be really legitimate and effective.

(10.) On the Relation between Multiple Contingency and Multiple Normal
Correlation.

Suppose instead of a single correlation table we have a multiple correlation system.
Such a system is well illustrated by the cabinet at Scotland Yard, which contains the
measurements of habitual criminals on the old system of body measurements, now
discarded in favour of a finger-print index. We have in this case a division of the
cabinet into 3 compartments, which mark a threefold division of long, medium, and
short head lengths. Each of these vertical divisions is then sub-divided horizontally
into three divisions giving the corresponding divisions for head breadth; each of these
head-breadth divisions has three drawers for large, moderate, and small face breadths.
Each drawer is sub-divided into three sections for three finger groups, and these again
into compartments for cubit groups, and so on. If this be carried out for the seven
characters dealt with, we should have ultimately $3^7$ sub-groups forming a multiple
correlation system of the 7th order.* We may ask what is the mean square
contingency of such a system and to what extent does it diverge from an independent
probability system? Of course, for an ideal anthropometric index system the
divergence should be very slight.

* When the number of groups = 4, we have ('Phil. Mag.,' vol. 50, p. 157 et seq.):

$$P = \sqrt{\frac{2}{\pi}}\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,d\chi + \sqrt{\frac{2}{\pi}}\,\chi\,e^{-\frac{1}{2}\chi^2} = \sqrt{\frac{2}{\pi}}\,e^{-\frac{1}{2}\chi^2}\left[\chi + \frac{1}{\chi} - \frac{1}{\chi^3} + \frac{1\cdot 3}{\chi^5} - \frac{1\cdot 3\cdot 5}{\chi^7} + \dots\right],$$

whence P is easily found if $\chi^2$ be large.

† Yule, loc. cit., p. 276.

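Let $x_1, x_2 \dots x_n$ be the $n$ variables of a multiple normal correlation surface, to which
the equation is

$$z = \frac{N}{(2\pi)^{\frac{1}{2}n}\,\sigma_1\sigma_2\dots\sigma_n\sqrt{R}}\;\exp\left[-\frac{1}{2}\left\{S_1\!\left(\frac{R_{pp}}{R}\,\frac{x_p^2}{\sigma_p^2}\right) + 2S_2\!\left(\frac{R_{pq}}{R}\,\frac{x_p x_q}{\sigma_p\sigma_q}\right)\right\}\right].$$

Here $\sigma_1, \sigma_2 \dots \sigma_n$ are the standard deviations of the $n$ variables; $S_1$ denotes a sum
of all values of $p$ from 1 to $n$, $S_2$ a sum of all unlike values of $p$ and $q$ from 1 to $n$;
while $R$ is the determinant

$$R = \begin{vmatrix} 1 & r_{12} & r_{13} & \dots & r_{1n} \\ r_{21} & 1 & r_{23} & \dots & r_{2n} \\ r_{31} & r_{32} & 1 & \dots & r_{3n} \\ \dots & \dots & \dots & \dots & \dots \\ r_{n1} & r_{n2} & r_{n3} & \dots & 1 \end{vmatrix},$$

and $R_{pq}$ is the minor corresponding to the constituent $r_{pq}$, and the $r$'s are the
correlation coefficients.†

Now if $\phi^2$ be the mean square contingency, we have

$$\phi^2 = \frac{1}{N}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\dots\int_{-\infty}^{\infty} \frac{(z - z_0)^2}{z_0}\,dx_1\,dx_2 \dots dx_n,$$

where $z_0$ = value of $z$ when all the $r$'s are zero, or

$$z_0 = \frac{N}{(2\pi)^{\frac{1}{2}n}\,\sigma_1\sigma_2\dots\sigma_n}\;\exp\left[-\frac{1}{2}\left\{S_1\!\left(\frac{x_p^2}{\sigma_p^2}\right)\right\}\right].$$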

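Thus we have, writing $x_p = \sigma_p x'_p$, etc.,

$$\phi^2 = \frac{1}{(2\pi)^{\frac{1}{2}n}}\int_{-\infty}^{\infty}\!\dots\int_{-\infty}^{\infty} \frac{(\zeta - \zeta_0)^2}{\zeta_0}\,dx'_1\,dx'_2 \dots dx'_n,$$

where

$$\zeta = \frac{1}{\sqrt{R}}\,\exp\left[-\frac{1}{2}\left\{S_1\!\left(\frac{R_{pp}}{R}\,x'^2_p\right) + 2S_2\!\left(\frac{R_{pq}}{R}\,x'_p x'_q\right)\right\}\right],$$

$$\zeta_0 = \exp\left[-\frac{1}{2}\left\{S_1(x'^2_p)\right\}\right].$$

Now

$$\int_{-\infty}^{\infty}\!\dots\int_{-\infty}^{\infty} \exp\left[-\frac{1}{2}\left\{S_1(c_{pp}x_p^2) + 2S_2(c_{pq}x_p x_q)\right\}\right]dx_1\,dx_2 \dots dx_n = \frac{(2\pi)^{\frac{1}{2}n}}{\sqrt{\Delta}} \qquad \text{(xxix.)},$$

where

$$\Delta = \begin{vmatrix} c_{11} & c_{12} & c_{13} & \dots & c_{1n} \\ c_{21} & c_{22} & c_{23} & \dots & c_{2n} \\ c_{31} & c_{32} & c_{33} & \dots & c_{3n} \\ \dots & \dots & \dots & \dots & \dots \\ c_{n1} & c_{n2} & c_{n3} & \dots & c_{nn} \end{vmatrix}.$$

* See Macdonell, "On Criminal Anthropometry," 'Biometrika,' vol. 1, p. 205 et seq.

† 'Phil. Trans.,' A, vol. 187, p. 302, or Ibid., A, vol. 200, pp. 3-8.
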

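We are now in a position to find all the integrals involved in the equation for $\phi^2$;
we have

$$\phi^2 = \frac{1}{R}\,\frac{1}{\sqrt{\Delta'}} - 2 + 1 = \frac{1}{R\sqrt{\Delta'}} - 1,$$

where

$$\Delta' = \begin{vmatrix} \dfrac{2R_{11}}{R} - 1 & \dfrac{2R_{12}}{R} & \dfrac{2R_{13}}{R} & \dots & \dfrac{2R_{1n}}{R} \\ \dfrac{2R_{21}}{R} & \dfrac{2R_{22}}{R} - 1 & \dfrac{2R_{23}}{R} & \dots & \dfrac{2R_{2n}}{R} \\ \dfrac{2R_{31}}{R} & \dfrac{2R_{32}}{R} & \dfrac{2R_{33}}{R} - 1 & \dots & \dfrac{2R_{3n}}{R} \\ \dots & \dots & \dots & \dots & \dots \\ \dfrac{2R_{n1}}{R} & \dfrac{2R_{n2}}{R} & \dfrac{2R_{n3}}{R} & \dots & \dfrac{2R_{nn}}{R} - 1 \end{vmatrix}.$$
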

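To evaluate this determinant, we notice that since $r_{pp} = 1$, we have, if $p$ and $q$ be
different,

$$R_{p1}r_{p1} + R_{p2}r_{p2} + R_{p3}r_{p3} + \dots + R_{pn}r_{pn} = R,$$

$$R_{p1}r_{q1} + R_{p2}r_{q2} + R_{p3}r_{q3} + \dots + R_{pn}r_{qn} = 0.$$

Hence

$$\frac{2R_{p1}}{R}r_{p1} + \frac{2R_{p2}}{R}r_{p2} + \frac{2R_{p3}}{R}r_{p3} + \dots + \left(\frac{2R_{pp}}{R} - 1\right)r_{pp} + \dots + \frac{2R_{pn}}{R}r_{pn} = 1,$$

and

$$\frac{2R_{p1}}{R}r_{q1} + \frac{2R_{p2}}{R}r_{q2} + \frac{2R_{p3}}{R}r_{q3} + \dots + \left(\frac{2R_{pp}}{R} - 1\right)r_{qp} + \dots + \frac{2R_{pn}}{R}r_{qn} = -r_{qp}.$$

Now multiply the determinant $\Delta'$ by the determinant $R$; we find, using the above
relations,

$$\Delta' R = \begin{vmatrix} 1 & -r_{12} & -r_{13} & \dots & -r_{1n} \\ -r_{21} & 1 & -r_{23} & \dots & -r_{2n} \\ -r_{31} & -r_{32} & 1 & \dots & -r_{3n} \\ \dots & \dots & \dots & \dots & \dots \\ -r_{n1} & -r_{n2} & -r_{n3} & \dots & 1 \end{vmatrix} = R', \text{ say.}$$

Here $R'$ is $R$ with the sign of all the correlations changed. Hence it follows that

$$\phi^2 = \frac{1}{\sqrt{RR'}} - 1 \qquad \text{(xxx.)}.$$
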


Special Cases.

(i.) Simple correlation

$$R' = R = 1 - r_{12}^2, \quad \text{and} \quad \phi^2 = r_{12}^2/(1 - r_{12}^2), \text{ as before.}$$

(ii.) Triple correlation

$$R = 1 - r_{23}^2 - r_{31}^2 - r_{12}^2 + 2r_{23}r_{31}r_{12},$$

$$R' = 1 - r_{23}^2 - r_{31}^2 - r_{12}^2 - 2r_{23}r_{31}r_{12},$$

$$\phi^2 = \frac{1}{\sqrt{(1 - r_{23}^2 - r_{31}^2 - r_{12}^2)^2 - 4r_{23}^2 r_{31}^2 r_{12}^2}} - 1.$$

(iii.) Quadruple correlation

$$RR' = \{1 - r_{12}^2 - r_{13}^2 - r_{14}^2 - r_{23}^2 - r_{24}^2 - r_{34}^2 + r_{12}^2 r_{34}^2 + r_{23}^2 r_{14}^2 + r_{13}^2 r_{24}^2 - 2(r_{12}r_{14}r_{23}r_{34} + r_{14}r_{13}r_{23}r_{24} + r_{12}r_{13}r_{24}r_{34})\}^2 - 4\{r_{23}r_{24}r_{34} + r_{34}r_{14}r_{13} + r_{12}r_{14}r_{24} + r_{12}r_{13}r_{23}\}^2,$$

and so on.
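
Result (xxx.) lends itself to direct computation. The Python sketch below is a modern illustration, not the memoir's method of working: the determinant routine (Gaussian elimination), the function names and the sample correlations are assumptions made for the example. It forms R and R' from a table of correlation coefficients, refuses systems for which either determinant is not positive, and returns the mean square contingency.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if abs(a[pivot][k]) < 1e-15:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return det


def multiple_phi2(corr):
    """Mean square contingency 1/sqrt(R R') - 1 for a symmetric table of correlations."""
    n = len(corr)
    # R' is R with the sign of every correlation (off-diagonal term) changed.
    reversed_corr = [[corr[i][j] if i == j else -corr[i][j] for j in range(n)]
                     for i in range(n)]
    r_det = determinant(corr)
    r_prime = determinant(reversed_corr)
    if r_det <= 0.0 or r_prime <= 0.0:
        raise ValueError("R and R' must both be positive")
    return 1.0 / math.sqrt(r_det * r_prime) - 1.0


simple = multiple_phi2([[1.0, 0.5], [0.5, 1.0]])
r12, r31, r23 = 0.3, 0.2, 0.4   # hypothetical coefficients
triple = multiple_phi2([[1.0, r12, r31], [r12, 1.0, r23], [r31, r23, 1.0]])
print(simple, triple)
```

For two variables the routine returns r²/(1 - r²), and for three it agrees with the closed form of case (ii.).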

Clearly a condition has to be satisfied among the correlation coefficients, or the
process by which we have deduced $\phi^2$ is not legitimate. We must have $\Delta$ positive for
equation (xxix.) to be true. Now, for normal correlation $R$ must be real and positive,
or the equation to the multiple correlation surface becomes imaginary. Hence it
follows that $\Delta'$ must be positive, and therefore $R'$ must be positive. This seems to
give a definite condition to be satisfied by the correlation coefficients, and in some
cases rather narrow limits are enforced. For example, in the case of triple correlation
we must have

$$1 - r_{23}^2 - r_{31}^2 - r_{12}^2 - 2r_{23}r_{31}r_{12}$$

positive, and this appears to reduce very considerably the possible values for the
correlationship of three characters.* The source of this novel condition appears to lie
in the integration of the term $\zeta^2/\zeta_0$, and this is only possible by use of equation (xxix.),
provided the surface $Z = \zeta^2/\zeta_0$ has "ellipsoidal" contours. If it has not, we may get
the subject of integration becoming infinite with one or other of the $x$'s, and
consequently, although both $\zeta$ and $\zeta_0$ vanish at $\infty$, $\zeta^2/\zeta_0$ may not do so, i.e., the mean
square contingency tends in certain tracks to become indefinitely large. In fact, our
method of deducing multiple contingency from the normal correlation coefficients is only
valid provided the system is not only a possible correlation system with the given values
of the coefficients, but also when these coefficients all have their signs reversed.

(11.) Illustrations. A. — Stature in Father and Son. 

Table II. gives the distribution of 1078 cases of stature in father and son.† The
correlation $r$, as found from the product moment in the usual way, is .514.

I propose to consider the approach of $C_1$ and $C_2$ to $r$ as we increase the fineness of
the grouping. Clearly it would involve extreme labour to work out the contingencies,
especially the mean square contingency, for the table as it stands.

To begin with I classed in three inch groups and got the following table, in which
the figures in brackets are the independent probabilities.

Table III. Stature of Father and Son in Inches.

(Columns: stature of father. Rows: stature of son. Figures in brackets are the independent probabilities.)

| Stature of Son. | 58.5–61.5 | 61.5–64.5 | 64.5–67.5 | 67.5–70.5 | 70.5–73.5 | 73.5–76.5 | Totals. | Chances. |
|---|---|---|---|---|---|---|---|---|
| 58.5–61.5 | – (.05) | 1.5 (.36) | 2 (1.20) | – (1.32) | – (.50) | – (.03) | 3.5 | .0032 |
| 61.5–64.5 | 3.5 (.84) | 19 (6.50) | 33 (21.75) | 5.5 (23.87) | 1.5 (9.02) | – (.55) | 62.5 | .0580 |
| 64.5–67.5 | 8.5 (4.02) | 53.75 (31.07) | 148 (104.03) | 80.5 (114.15) | 8.25 (43.14) | – (2.64) | 299 | .2774 |
| 67.5–70.5 | 2.5 (6.07) | 33.25 (46.86) | 149.25 (156.90) | 202.25 (172.17) | 60.25 (65.06) | 3.5 (3.97) | 451 | .4184 |
| 70.5–73.5 | – (2.87) | 3.5 (22.13) | 39.75 (74.10) | 104.25 (81.31) | 62 (30.73) | 3.5 (1.88) | 213 | .1976 |
| 73.5–76.5 | – (.56) | 1 (4.31) | 3 (14.44) | 14.5 (15.84) | 20.5 (5.99) | 2.5 (.37) | 41.5 | .0385 |
| 76.5–79.5 | – (.10) | – (.77) | – (2.59) | 4.5 (2.84) | 3 (1.07) | – (.07) | 7.5 | .0069 |
| Totals | 14.5 | 112 | 375 | 411.5 | 155.5 | 9.5 | 1078 | 1.0000 |

* For example, if .5 be the value of parental correlation, then the correlation of two brothers could not
exceed .5 without making $R'$ negative.
† See 'Biometrika,' vol. 2, p. 415.

The independent probabilities were found by multiplying the "chances" of a son
occurring in each group by the totals for each group of fathers. Taking the difference
of the observed sub-group frequencies and the independent probability frequencies, we
have $N \times \psi = 205.62$ from the positive and $= -205.66$ from the negative differences,
a quite good agreement. Hence we find $\psi = .1908$.

Using Diagram I. we have

$C_2 = .522$.

Proceeding now to the mean square contingency obtained by squaring all the above
found contingencies, dividing each by the independent probability frequency and
summing, we find

$\phi^2 = .2755$,

whence

$C_1 = .465$.
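
The working just described can be retraced with a short Python sketch (a modern check, not the memoir's computation). The cell frequencies are those of Table III. as read here, with blank cells taken as zero; the function name is hypothetical. The independent probability frequencies are computed exactly from the marginal totals rather than from the rounded "chances", so the results may differ from the printed figures in the third decimal place. $C_2$ is not computed, since it is read from Diagram I.

```python
import math

# Observed frequencies (rows: stature of son; columns: stature of father),
# both in 3-inch groups, as read from Table III.
observed = [
    [0, 1.5, 2, 0, 0, 0],
    [3.5, 19, 33, 5.5, 1.5, 0],
    [8.5, 53.75, 148, 80.5, 8.25, 0],
    [2.5, 33.25, 149.25, 202.25, 60.25, 3.5],
    [0, 3.5, 39.75, 104.25, 62, 3.5],
    [0, 1, 3, 14.5, 20.5, 2.5],
    [0, 0, 0, 4.5, 3, 0],
]


def contingency_summary(table):
    """Return N, mean contingency psi, mean square contingency phi^2 and C1."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    positive = 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # independent probability frequency
            c = f - expected                              # contingency of the cell
            if c > 0:
                positive += c                             # N x psi, from the positive differences
            chi2 += c * c / expected
    phi2 = chi2 / n
    return n, positive / n, phi2, math.sqrt(phi2 / (1.0 + phi2))


n, psi, phi2, c1 = contingency_summary(observed)
print(n, round(psi, 4), round(phi2, 4), round(c1, 3))
```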

The value of $C_1$ is clearly too small. We must infer that our grouping was not
fine enough. Accordingly in Table IV. I have re-arranged the matter in 2-inch
groupings, and have then in the same manner proceeded to find $\psi$ and $\phi^2$. In this
case I found $\psi = .2013$, and thus

$C_2 = .542$,

while

$\phi^2 = .3568$,

and

$C_1 = .513$.

I thus conclude that the grouping is now fine enough to give $C_1$ and $C_2$
approximately equal to the correlation.*

* i.e., within the probable error of that result.

Table IV. Stature of Father and Son in Inches.

(Columns: stature of father. Rows: stature of son. Figures in brackets are the independent probabilities.)

| Stature of Son. | 58.5–60.5 | 60.5–62.5 | 62.5–64.5 | 64.5–66.5 | 66.5–68.5 | 68.5–70.5 | 70.5–72.5 | 72.5–74.5 | 74.5–76.5 | Totals. | Chances. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 59.5–61.5 | – (.02) | – (.08) | 1.5 (.31) | 1 (.77) | 1 (.95) | – (.84) | – (.41) | – (.11) | – (.02) | 3.5 | .00325 |
| 61.5–63.5 | .5 (.14) | 2.75 (.56) | 5.75 (2.11) | 9.5 (5.29) | 5 (6.49) | .25 (5.73) | .25 (2.83) | – (.72) | – (.12) | 24 | .02226 |
| 63.5–65.5 | 4 (.60) | 7.75 (2.32) | 20 (8.81) | 41.5 (22.03) | 17.25 (27.04) | 8.25 (23.89) | 1.25 (11.78) | – (3.01) | – (.51) | 100 | .09276 |
| 65.5–67.5 | 2 (1.43) | 10 (5.51) | 32 (20.93) | 73 (52.33) | 78.75 (64.22) | 33.5 (56.73) | 7.25 (27.98) | 1 (7.16) | – (1.21) | 237.5 | .22032 |
| 67.5–69.5 | – (1.95) | 4.5 (7.49) | 27.75 (28.46) | 65.5 (71.16) | 95 (87.34) | 93.25 (77.15) | 31.5 (38.03) | 4.5 (9.74) | 1 (1.65) | 323 | .29963 |
| 69.5–71.5 | – (1.42) | – (5.47) | 6.75 (20.80) | 38.25 (51.99) | 61 (63.82) | 77.5 (56.37) | 39.5 (27.80) | 11 (7.11) | 2 (1.20) | 236 | .21892 |
| 71.5–73.5 | – (.63) | – (2.44) | .25 (9.25) | 5.75 (23.13) | 24.75 (28.39) | 34.5 (25.08) | 32.25 (12.37) | 7 (3.17) | .5 (.54) | 105 | .09740 |
| 73.5–75.5 | – (.23) | – (.87) | 1 (3.31) | 3 (8.26) | 6.25 (10.14) | 6.75 (8.96) | 13 (4.42) | 5.5 (1.13) | 2 (.19) | 37.5 | .03479 |
| 75.5–77.5 | – (.05) | – (.19) | – (.70) | – (1.76) | 2.5 (2.16) | 1.5 (1.91) | 1.5 (.94) | 2.5 (.24) | – (.04) | 8 | .00742 |
| 77.5–79.5 | – (.02) | – (.08) | – (.31) | – (.77) | – (.95) | 2 (.84) | .5 (.41) | 1 (.11) | – (.02) | 3.5 | .00325 |
| Totals | 6.5 | 25 | 95 | 237.5 | 291.5 | 257.5 | 127 | 32.5 | 5.5 | 1078 | 1.00000 |



To show the effect of too fine a grouping, I worked out the mean contingency for
the inch grouping in Table II. There resulted

$\psi = .2309$, giving $C_2 = .597$.

I therefore conclude that with sufficiently fine grouping the new method of
contingency will give contingency coefficients sensibly equal to the correlation
coefficient, but that with over fine grouping the effect of individual units, scattered
here and there at random over the table, becomes influential and exaggerates the
value of the correlation. Hence, when a correlation table can be formed and worked
in the old ways, there is little doubt that it is safer to do so, and the labour will
hardly be sensibly greater, at least when compared with the method of mean square
contingency. I have not faced the labour required to determine the mean square
contingency of the table with 340 sub-groups. Dr. Lee has worked out the mean
square contingency for a table with 400 sub-groups, and we do not think it desirable
to deal with a table of more than $10^2$ to $15^2$ entries again. Still the mean square
contingency coefficient will hardly be as great on the full table as the mean
contingency coefficient.

The following table gives the results:

Comparison of Methods of Finding Correlation.

| No. of groupings. | Mean contingency. | Mean square contingency. |
|---|---|---|
| 42 | .522 | .465 |
| 90 | .542 | .513 |
| 340 | .597 | – |

Fourfold division (mean of six divisions)*: .550.

Correlation table: .514.



Thus the first contingency method approaches the fourfold, the second, the
ordinary correlation method.

Diagram II. at the end of this memoir gives the hyperbola of zero contingency for
this case, calculated on the basis of the correlation coefficient being .514. The means
and standard deviations are:

| | Mean. | Standard deviation. |
|---|---|---|
| Father | 67.698 in. | 2.7048 in. |
| Son | 68.661 in. | 2.7321 in. |

and the equation to the hyperbola referred to the means as the origin is

$$x^2 - 3.8522\,xy + .9801\,y^2 = 6.2510.$$

The shaded squares are those of positive contingency. It will be seen that the
hyperbola separates fairly well areas of positive from areas of negative contingency.
In most cases where there is an invasion across the boundary, the contingencies
hardly differ from zero by amounts greater than the probable errors due to random
sampling.
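
The coefficients of this hyperbola can be recovered from r and the two standard deviations. The Python sketch below is a reconstruction under stated assumptions, not a quotation of the memoir's derivation: the curve of zero contingency is taken to be the locus on which the normal correlation surface equals the product of its two marginal distributions, x is taken as the father's and y as the son's deviation from the mean (a reading suggested by the printed coefficients), and the function name is hypothetical.

```python
import math


def zero_contingency_hyperbola(r, sigma_x, sigma_y):
    """Coefficients (b, c, k) of  x^2 - b*x*y + c*y^2 = k.

    Equating the normal correlation surface to the product of its marginals
    gives, in standard units,  u^2 - (2/r) u v + v^2 = -((1 - r^2)/r^2) ln(1 - r^2);
    putting u = x/sigma_x, v = y/sigma_y and multiplying by sigma_x^2 gives the form above.
    """
    ratio = sigma_x / sigma_y
    b = (2.0 / r) * ratio
    c = ratio ** 2
    k = -((1.0 - r * r) / (r * r)) * math.log(1.0 - r * r) * sigma_x ** 2
    return b, c, k


b, c, k = zero_contingency_hyperbola(r=0.514, sigma_x=2.7048, sigma_y=2.7321)
print(round(b, 4), round(c, 4), round(k, 4))
```

With the constants quoted in the text the three coefficients agree with the printed equation to about the last figure given.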


Illustration B. — Data from Colour Inheritance in Greyhounds. 

In the previous example we have dealt with material in which contingency methods
were directly comparable as to result with the correlation found by the "best" or
product method process. In this illustration I deal with matter which can only
provide a correlation to be found by the fourfold division process for comparison with
the contingency coefficients. The data from which this illustration is drawn were
extracted by Miss A. Barrington from the 'Greyhound Studbook.' We deal with
the inheritance of red and black pigments in the coat colour. I have selected six
cases of the resemblance of brethren from different litters to compare the methods on.
Tables were formed giving 16 to 25 contingency sub-groups of varying degrees of
pigment, and these were worked out (a) by Miss Barrington herself for the mean
square contingency, (b) by myself for the mean contingency, and (c) by Dr. A. Lee
for the fourfold correlation results. The results reached are given in the accompanying
table. It is desirable to state that the number dealt with was about 1000 pairs of
brethren in each case.

* See 'Phil. Trans.,' A, vol. 195, p. 42. The values range from .531 to .594, or almost the same range
as we obtain from the mean contingency results.


Table V. Fraternal Resemblance of Greyhounds from Different Litters.

| Character. | $C_1$, Mean Square Contingency. | $C_2$, Mean Contingency. | $r$, Fourfold Table. |
|---|---|---|---|
| Red in brothers | .478 | .695 | .456 |
| Red in sisters | .528 | .612 | .620 |
| Red in sister and brother | .488 | .615 | .450 |
| Black in brothers | .512 | .615 | .558 |
| Black in sisters | .482 | .632 | .552 |
| Black in sister and brother | .502 | .622 | .593 |
| Mean | .498 | .632 | .538 |
| Mean deviation from mean | .016 | .032 | .057 |


We see at once from this table that the method of mean square contingency gives
far more uniform results than either the mean contingency method or the fourfold
division method. The average given by it is close to what we have found for
fraternal resemblance, i.e., .5, in other cases, and within fairly close limits, all six
cases now give .5. The mean contingency gives results more divergent among
themselves, but less so than those of the fourfold division method; their average,
however, diverges most from what we have found in other cases.

The lesson, I think, to be learnt from this is : That the mean square contingency 
coefficient, although more laborious to find, is better than the mean contingency 
coefficient. That even with only 16 to 25 contingency sub-groups we may deduce 
results comparable with those obtained by fourfold divisions. But that it is probably 
always necessary to check a series by a certain number of fourfold division workings, 
for such are the only test that we have not got too crude a grouping reducing the 
contingency coefficient below the correlation value, or too fine a grouping introducing 
the difficulty already referred to (see p. 16), of magnifying the contingency coefficient 
owing to anomalous units. 


Illustration C. — Hair Colour in Man. 


I take the subject of hair colour because it is one in which doubts have been raised 
as to the order of pigments in a scale. 

The following table gives the resemblance of pairs of brothers in hair colour:

Table VI.

(Columns: first brother. Rows: second brother.)

| Second Brother. | Red. | Fair. | Brown. | Dark. | Jet Black. | Totals. |
|---|---|---|---|---|---|---|
| Red | 30.5 | 23 | 16 | 12 | – | 81.5 |
| Fair | 23 | 416 | 158 | 67.75 | .25 | 665 |
| Brown | 16 | 158 | 394 | 98.25 | 8.25 | 674.5 |
| Dark | 12 | 67.75 | 98.25 | 328.5 | 19 | 525.5 |
| Jet Black | – | .25 | 8.25 | 19 | 10 | 37.5 |
| Totals | 81.5 | 665 | 674.5 | 525.5 | 37.5 | 1984 |



The correlation found by taking the mean of four four-fold table divisions was .621.*
This result is based on the above scale order. We will now see what difference will
arise if we work by contingency, so that the scale order is absolutely indifferent, e.g.,
red might follow jet black.

We find

$\phi^2 = .603896$,

and accordingly $C_1 = .614$, a result within the limits of the probable error identical
with the value of $r$ found from the four-fold division method.
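
That the scale order is indifferent to the mean square contingency is easily shown by computation. The Python sketch below is a modern illustration with a hypothetical function name; the frequencies are those of Table VI. as read here, with blank cells taken as zero. It computes $\phi^2$ and $C_1$ for the printed order and again after moving jet black to the head of both scales.

```python
import math

# Order of classes: red, fair, brown, dark, jet black (first brother across, second brother down).
hair = [
    [30.5, 23, 16, 12, 0],
    [23, 416, 158, 67.75, 0.25],
    [16, 158, 394, 98.25, 8.25],
    [12, 67.75, 98.25, 328.5, 19],
    [0, 0.25, 8.25, 19, 10],
]


def mean_square_contingency(table):
    """phi^2 = S(n_uv^2 / (n_u n_v)) - 1, an equivalent form of S((n_uv - e_uv)^2 / e_uv) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(f * f / (row_totals[i] * col_totals[j])
                for i, row in enumerate(table) for j, f in enumerate(row))
    return total - 1.0


order = [4, 0, 1, 2, 3]   # jet black first, so that red follows jet black
permuted = [[hair[i][j] for j in order] for i in order]

phi2 = mean_square_contingency(hair)
phi2_permuted = mean_square_contingency(permuted)
c1 = math.sqrt(phi2 / (1.0 + phi2))
print(round(phi2, 4), round(phi2_permuted, 4), round(c1, 3))
```

Any other re-ordering of the classes may be substituted for `order`; the two values of $\phi^2$ agree to rounding error because every cell enters the sum on its own, without regard to its neighbours.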

This illustration confirms the opinion I have already expressed, i.e., that if the
contingency be calculated for 16 to 36 sub-groups we shall obtain by the method of
mean square contingency satisfactory results, i.e., values close to the coefficient of
correlation as found by product moment or four-fold division methods. In this case,
as in others, I find the mean contingency far inferior to the mean square contingency.

My experience seems to show that about 25 sub-groups is the distribution to be
aimed at; 9 is too few. Thus I worked out the relationship of temper in sisters for
three-fold division (sullen, good-tempered, quick-tempered), or for 9 sub-groups.
The method of mean contingency gave .44 and of mean squared contingency .36.
Both far too small, as I find from each of four four-fold divisions a result of about .5.


Illustration D. — On Occupational or Professional Correlation between Relatives. 

I take as a final illustration a case in which any idea of scale is practically
inconceivable, and yet one in which it is of considerable interest to measure the
deviation from independent probability. It belongs to a class of problems in which I
hope this new method of contingency will be fruitful of result. In classifying men
into occupational and professional groups, we clearly cannot do so on the basis of any
scale which will put the army, church, and bar in any special order. On the other
hand, it becomes of special interest to determine how far tastes and preferences for
particular callings in life run in families. Miss Emily Perrin has undertaken a
lengthy investigation of this kind, and has provided me with the pure contingency
table given as Table VII. The occupations of 775 fathers and sons are here classed
in broad general groups, which can be arranged purely alphabetically. More minute
divisions and data for other series of relatives will be published later by Miss Perrin,
and it is not my present purpose to anticipate her conclusions, but merely to suggest
the valuable applications which may be made of the novel methods to pure
contingency results. What is the numerical measure of the relationship in pursuit
between father and son, and how far is it removed from a mere chance relationship?

* "Huxley Memorial Lecture," 'Journal of Anthropological Institute,' vol. 33, pp. 197 and 215.


Table VII. — Contingency between Occupations of Fathers and Sons. 








Occupation of Son. 






CO 

o 

Nature of occupation. 

>. 

■+3 

§3 

go 

i 

1 

CO 

so 

■i 

o 

03 

1 

<6 

i 

o 

1 

> 

1 



■^ 

< 

H 

O 

ft 

< 

tS 

hJ 

k5 

O 

S 

;2i 

^ 

xn 

H 


Army .... 

28 


4 




1 

3 

3 


3 

1 

5 
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2 

51 

1 

1 

2 

. 

— 

1 

2 

— 

— 

— 

1 

1 
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6 

5 

7 

— 

9 

1 

3 

6 

4 

2 

1 

1 

2 
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54 

fe 

Crafts . . . 

— 

12 
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5 
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1 

7 

1 
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10 

44 

^ 

Divinity . . . 

5 

5 

2 

1 

54 

— 

— 

6 

9 

4 

12 

3 

1 

13 

115 


Agriculture . . 

— 

2 

3 

— 

3 

— 

— 

1 

4 

1 

4 

2 

1 

5 

26 


Landownership 

17 

1 

4 

— 

14 

— 

6 

11 
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3 

3 
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7 

88 

fl 

Law .... 

3 

5 

6 

— 

6 

. — 

2 

18 

13 

1 

1 

1 

8 

5 

69 

•J 

Literature . . 

— 

1 

1 

— 

4 
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— 

1 

4 

. — 

2 

1 

1 

4 

19 

1 

Commerce . . 

12 

16 

4 

1 

15 

— 

— 

5 

13 

11 

6 

1 

7 

15 
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Medicine . . . 

— 

4 

2 

— 

1 

— 

— 

— 

3 

— 

20 

— 

5 

6 

41 
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Navy .... 

1 

3 

1 

— 
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— 

1 

— 

1 

1 

1 

6 

2 

1 
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Court . . J 
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and Science j 

5 

— 

2 

— 

3 

— 

1 

8 

1 

2 

2 

3 

23 

1 

51 


5 

3 

— 

2 

6 

— 

1 

3 

1 

— 

— 

1 

1 

9 

32 


Totals . . 

84 

108 

37 

11 

12.2 

1. 

15 

64 

69 

24 

57 

23 

74 

86 

775 


Miss Perrin has extracted this first series from the 'Dictionary of National
Biography'; hence she has, as a rule, tabled the distinguished, or at least moderately
distinguished, sons of less distinguished fathers. It is, for example, not easy to win
any form of distinction in agriculture. For this reason the distribution of occupations
for sons differs widely from that of the occupations for fathers. There has accordingly
been selection of the second generation, which undoubtedly must influence the
result, i.e., tend to weaken the observed relationship.

Working out the 196 contingencies, squaring, dividing by the independent
probability frequencies, summing and averaging, I find for the mean square
contingency

(^3=r: 1-299206, 

whence

$\phi^2 / (1 + \phi^2) = 0.393794,$

and the coefficient of mean square contingency = 0.6275. This would correspond to
the correlation in occupation between father and son. Now if occupation were settled
solely by fitness or taste, and these characters were inherited as other human faculties,
we should expect the correlation between father and son to be about 0.46.* Or,
roughly, the hereditary relationship is increased by about ⅓ in the matter of
occupation. Remembering what we have noted as to selection above, the real
increment is probably somewhat larger than this. Roughly, however, we may
conclude from Miss Perrin's data that about ¾ of the observed resemblance in
occupation between father and son is due to hereditary influences, and the remaining
¼ to environmental effect. These numbers are subject to revision when Miss Perrin's
data are more ample and have been more fully analysed and discussed.
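
The procedure described above (work out each contingency, square it, divide by the
independent probability frequency, sum and average, then pass to the coefficient) can
be sketched in Python. This is a minimal illustration and not the computation as it was
carried out for the memoir: the function names and the two small tables are invented
for the example and are not Miss Perrin's data. The last two assignments only re-check
the arithmetic quoted in the text, taking the printed values 0.393794, 0.6275 and 0.46
as given.

```python
from math import sqrt


def mean_square_contingency(table):
    """Return phi^2, the mean square contingency of a table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency to be expected if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                # Contingency of the cell, squared and divided by that frequency.
                chi_sq += (observed - expected) ** 2 / expected
    # "Averaging": divide the sum by the total number of individuals.
    return chi_sq / n


def contingency_coefficient(phi_sq):
    """Coefficient of mean square contingency, sqrt(phi^2 / (1 + phi^2))."""
    return sqrt(phi_sq / (1.0 + phi_sq))


# Hypothetical tables, for illustration only.
toy = [[10, 0], [0, 10]]          # complete association in a fourfold table
independent = [[2, 4], [3, 6]]    # rows proportional, so no contingency

phi_sq_toy = mean_square_contingency(toy)
c_toy = contingency_coefficient(phi_sq_toy)
phi_sq_independent = mean_square_contingency(independent)

# Re-checking the figures quoted in the text.
c_reported = sqrt(0.393794)        # coefficient from phi^2 / (1 + phi^2)
hereditary_share = 0.46 / 0.6275   # expected correlation over observed coefficient

print(phi_sq_toy, round(c_toy, 4), phi_sq_independent)
print(round(c_reported, 4), round(hereditary_share, 3))
```

For the fourfold toy table the mean square contingency is 1 and the coefficient is
the square root of one half, while the proportional table gives zero. The ratio
0.46 / 0.6275 comes to a little under three-quarters, which is the rough division
between hereditary and environmental influence stated above.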

(12.) General Conclusions. 

The general conception of contingency developed in this memoir I consider in the 
first place of theoretical importance. Its practical applications are not negligible, but 
are, for reasons given below, of less importance than might a priori be supposed. 

(a.) In the first place, the conception of contingency enables us at once to generalise
the notion of the association of two attributes developed by Mr. Yule. We can class
individuals not into two alternate groups, but into as many groups with exclusive
attributes as we please, and either the mean contingency or the mean square
contingency will enable us to see the extent to which two such systems are contingent
or non-contingent.

(b.) This result enables us to start from the mathematical theory of independent
probability as developed in the elementary text books, and build up from it a
generalised theory of association, or, as I term it, contingency. We reach the notion
of a pure contingency table, in which the order of the sub-groups is of no importance
whatever.

(c.) We then investigate the relation of contingency to normal correlation, and
find that with normal frequency distributions both contingency coefficients pass with
sufficiently fine grouping into the well-known correlation coefficient. Since, however,
the contingency is independent of the order of grouping, we conclude that when we
are dealing with alternative and exclusive sub-attributes, we need not insist on the
importance of any particular order or scale for the arrangement of the sub-groups.

* 'Biometrika,' vol. 2, p. 379.

(d.) This conception can be extended from normal correlation to any distribution 
with linear regression; small changes (i.e., such that the sum of their squares may
be neglected as compared with the square of mean or standard deviation) may be 
made in the order of grouping without affecting the correlation coefficient. 

(e.) The results (c) and (d) are not so fruitful for practical working as might at
first sight appear, for they depend in practice on the legitimacy of replacing finite
integrals by sums over a series of varying areas, where no quadrature formula is
available. If we, to meet the difficulty, make a very great number of small classes, 
the calculation, especially of the mean square contingency, becomes excessively 
laborious. Further, since in observation individuals go by units, casual individuals, 
which may fairly represent the total frequency of a considerable area, will be found 
on some one or other isolated small area, and thus increase out of all proportion the 
contingency. The like difficulty occurs when we deal with outlying individuals in 
the case of frequency curves, only it is immensely exaggerated in the case of 
frequency surfaces. 

(f.) It is thus not desirable in actual practice to take too many or too fine sub- 
groupings. It is found, under these conditions, that the correlation coefficient as 
determined by the product moment or fourfold division methods is approximated to 
more closely in the case of the contingency coefficient found from mean square 
contingency than in the case of that found from mean contingency. Probably 
16 to 25 contingency sub-groups will give fairly good results in the case of mean 
square contingency, but for each particular type of investigation it appears desirable 
to check the number of groups proper for the purpose by comparison with the results 
of test fourfold division correlations. Under such conditions it appears likely that 
very steady and consistent results will be obtained from mean square contingency. 

(g.) Finally, contingency may be applied, of course at first tentatively and with
caution, in the consideration of a whole class of problems in which no attempt at a
scale or order of sub-groups is possible, in short, where alphabetical order is as good
as any other. For example, it would seem to be available in a vast range of problems
of exclusive and alternative inheritance.
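
Conclusions (b.), (e.) and (f.) above can be illustrated numerically. The sketch below
is a simulation under stated assumptions, not anything carried out in the memoir: it
draws 500 pairs from a normal surface with correlation 0.5 and groups them into
equal-width classes on the range -3 to 3. The sample size, the correlation, the seed,
the class limits and all function names are choices made for this example. It compares
the coefficient of mean square contingency for 25 sub-groups, for the same 25
sub-groups rearranged, and for 1600 very small sub-groups.

```python
import random
from math import sqrt


def contingency_coefficient(table):
    """Coefficient of mean square contingency, sqrt(phi^2 / (1 + phi^2))."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip classes that received no individuals
                chi_sq += (observed - expected) ** 2 / expected
    phi_sq = chi_sq / n
    return sqrt(phi_sq / (1.0 + phi_sq))


def reorder(table, row_order, col_order):
    """Return the same table with its rows and columns rearranged."""
    return [[table[i][j] for j in col_order] for i in row_order]


def normal_pairs(n, r, rng):
    """Draw n pairs from a normal surface with correlation r and unit variances."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = r * x + sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


def grouped_table(pairs, k, lo=-3.0, hi=3.0):
    """Sort pairs into a k x k table of equal-width classes on [lo, hi]."""
    width = (hi - lo) / k

    def class_index(value):
        # Values beyond the range fall into the outermost classes.
        return min(k - 1, max(0, int((value - lo) / width)))

    table = [[0] * k for _ in range(k)]
    for x, y in pairs:
        table[class_index(x)][class_index(y)] += 1
    return table


rng = random.Random(1904)            # arbitrary seed, for repeatability only
pairs = normal_pairs(500, 0.5, rng)  # assumed sample size and correlation

coarse = grouped_table(pairs, 5)     # 25 sub-groups
fine = grouped_table(pairs, 40)      # 1600 sub-groups, most of them empty

c_coarse = contingency_coefficient(coarse)
c_fine = contingency_coefficient(fine)

# Rearranging the classes leaves the coefficient unchanged.
shuffled = reorder(coarse, [3, 0, 4, 1, 2], [2, 4, 1, 0, 3])
c_shuffled = contingency_coefficient(shuffled)

print(f"5 x 5 grouping:   C = {c_coarse:.4f}")
print(f"same, reordered:  C = {c_shuffled:.4f}")
print(f"40 x 40 grouping: C = {c_fine:.4f}")
```

The reordered table returns the same coefficient as the original, apart from rounding
in the last binary places. The 40 x 40 grouping, in which isolated individuals occupy
areas whose independent probability frequency is a small fraction of a unit, returns a
coefficient well above the correlation of 0.5 used to generate the sample. This is the
exaggeration described in (e.) and the reason for the moderate number of sub-groups
recommended in (f.).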
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